
PROCEEDINGS OF THE FOURTH 
INTERNATIONAL CONFERENCE 
ON OPERATIONAL RESEARCH 


ACTES DE LA QUATRIEME 
CONFERENCE INTERNATIONALE 
DE RECHERCHE OPERATIONNELLE 



PUBLICATIONS IN OPERATIONS RESEARCH 


Operations Research Society of America 

Editor for Publications in Operations Research 

David B. Hertz 



No. 1. Queues, Inventories and Maintenance 

Philip M. Morse 

No. 2. Finite Queuing Tables 
L. G. Peck and K. N. Hazelwood 

No. 3. Efficiency in Government through Systems Analysis 
Roland N. McKean 

No. 4. A Comprehensive Bibliography on Operations Research 
Operations Research Group, Case Institute 

No. 5. Progress in Operations Research, Volume I 
Edited by Russell L. Ackoff 

No. 6. Statistical Management of Inventory Systems 
Harvey M. Wagner 

No. 7. Price, Output, and Inventory Policy 
Edwin S. Mills 

No. 8. A Comprehensive Bibliography on Operations Research, 1957-1958 
Operations Research Group, Case Institute 

No. 9. Progress in Operations Research, Volume II 
David B. Hertz and Roger T. Eddison 
No. 10. Decision and Value Theory 
Peter C. Fishburn 

No. 11. Handbook of the Poisson Distribution 
Frank A. Haight 

No. 12. Operations Research in Seller’s Competition: A 
Stochastic Microtheory 
S. Sankar Sengupta 

No. 13. Bayesian Decision Problems & Markov Chains 
J. J. Martin 

No. 14. Mathematical Models of Arms Control and Disarmament: 
Application of Mathematical Structures in Politics 
Thomas L. Saaty 

No. 15. Fourth International Conference on Operational Research 
David B. Hertz and Jacques Melese 

No. 16. Progress in Operations Research, Volume III: 
Relationship Between Operations Research and the Computer 
J. S. Aronofsky 



PROCEEDINGS OF THE FOURTH 
INTERNATIONAL CONFERENCE ON 


OPERATIONAL 

RESEARCH 


ACTES DE LA QUATRIEME 
CONFERENCE INTERNATIONALE DE 

RECHERCHE 

OPERATIONNELLE 


Organized by the International Federation 
of Operational Research Societies 


EDITED BY 

DAVID B. HERTZ and JACQUES MELESE 

PRESENTES PAR 

DAVID B. HERTZ et JACQUES MELESE 


WILEY-INTERSCIENCE 


A DIVISION OF JOHN WILEY & SONS 
NEW YORK • LONDON • SYDNEY • TORONTO 



Copyright © 1968 by 
International Federation of Operational Research Societies 


All rights reserved. No part of this book may 
be reproduced by any means, nor transmitted, 
nor translated into a machine language with- 
out the written permission of the publisher. 

1 2 3 4 5 6 7 8 9 10 


Library of Congress Catalog Card Number: 68-22304 
SBN 471 37320 X 

Printed in the United States of America 



EDITORS’ INTRODUCTION 


The Fourth International Conference on Operational Research was held at 
the Massachusetts Institute of Technology in Boston between August 29 and 
September 2, 1966. Conferees came from 14 nations and presented a total 
of more than 300 papers. 

The program for the conference was a varied one that revealed the wide 
range of problems to which operations research is being applied and the impos- 
sibility of placing specific bounds on the field. 

Although “operations research” continues to defy precise definition, 
operations researchers nevertheless have problems in common, and these 
common concerns were the subject of the three principal sessions of the program. 
Session 1 was devoted to advances in techniques of mathematical programming, 
an area subject to continued investigation and one in which perhaps the most 
significant advances have been made since the Third International Conference 
in 1963. Session 2 concerned decision theory, with emphasis on the relevance 
of theoretical decision theory to those who actually make decisions, and the 
similarities of approach of the researcher and the decision maker. Session 3 
considered progress in the area of model building — an area that, given the 
importance of mathematical techniques in operations research, may perhaps be 
defined as the distinguishing characteristic of the operations research approach 
to problems. 

The general sessions were followed by a group of sessions devoted to specific 
techniques and applications — the theory of graphs, marketing, transportation, 
urban planning, investment-policy analysis, scheduling problems, simulation, 
natural resources, and distribution systems. Finally, four additional sessions 
were organized to accommodate what we have called “general contributed 
papers,” the topics of which did not fit in any of the preceding sessions or which 
considered the problems of the general sessions in greater depth. Informal discus- 
sion meetings were also held during the course of the conference. 

Dr. Charles J. Hitch, now Vice-President of the University of California and 
formerly Assistant Secretary of Defense, gave the keynote address. He spoke 
of the Planning, Programming, and Budgeting System of the U.S. Depart- 
ment of Defense, and his address thus concerned studies at the national level. 
Although P.P.B.S. may not have been born under the banner of operations 
research, it nevertheless is related to operations research in its method of analy- 
sis, its objective of quantification and cost-benefit analyses, and its goal of 
objective control of complex organizations. 

Because of the abundance of papers presented, the editors found it necessary 
to ask some contributors to shorten their contributions and others to present 
in this volume only an abstract in those cases in which the subject of the papers 
was least related to the purposes of the conference or in which only conclusions 
or work in progress were reported. 





The editors owe special thanks to Annelise Anderson of Columbia University 
for unremitting editorial and supervisory assistance during the preparation of 
this volume. The Operations Research Society of America generously provided 
the basis for its publication by including it in its Publications in Operations 
Research Series. 

On the whole, the increasingly sophisticated understanding of the nature 
of models that realistically can reflect the economic, physical, and social 
structures of our times is revealed in the various papers contained in this 
volume. It records one additional significant step in the international intel- 
lectual adventure called operations research. 


New York 
Paris 


David B. Hertz 
Jacques Melese 



EDITORS’ INTRODUCTION 


The Fourth International Conference on Operational Research was held in Boston, 
from August 29 to September 2, 1966, at the Massachusetts Institute of Technology. 
The participants, who came from 14 different countries, presented more than 
100 papers. 

The program of the conference was highly varied, reflecting the great diversity 
of the problems addressed by operational research and the impossibility of 
setting precise limits to its field of application. 

Although operational research continues to elude any restrictive definition, 
operational researchers have many problems in common, and the three principal 
sessions of the conference took such problems as their themes. 

Session 1 centered on innovations in the techniques of mathematical program- 
ming, a field that is the object of continuing research and probably one of those 
in which the most significant progress has been made since the Third International 
Conference of 1963. 

Session 2 concerned decision theory, with particular attention to the relations 
between theory and practice and to the comparison between the approaches of 
researchers and those of decision makers. 

Session 3 centered on recent progress in model building; given the importance 
of mathematical techniques in operational research, this theme covers the most 
distinctive feature of the methodology of operational research. 

These three general sessions were accompanied by a set of sessions devoted 
to specific techniques and applications: graph theory, marketing, transportation, 
urban planning, investment policy, scheduling, simulation, natural resources, 
and distribution systems. In addition, four supplementary sessions were organized 
to present what we have called the “General Contributions”: papers whose 
subject did not fall within the scope of the preceding sessions, or which treated 
the problems of the general sessions on a more technical level. Finally, many 
informal meetings were also held during the course of the conference. 

Dr. Charles J. Hitch, now Vice-President of the University of California and 
formerly Assistant Secretary of Defense, delivered the opening address. He 
presented the “Planning, Programming and Budgeting System” of the United 
States Department of Defense, thus placing himself at the level of national 
studies. Although P.P.B.S. was not born under the banner of operational 
research, it is nevertheless very close to it in its method of analysis, its objectives 
of quantification and cost-benefit analysis, and its goal of controlling complex 
organizations. 

Given the abundance of the papers presented, the editors found it necessary 
to ask some authors to shorten their papers and others to present only an 
abstract in this volume; the latter case concerns papers whose subject was 
somewhat removed from the principal themes of the conference, or which 
reported only work in progress. 

The editors are particularly grateful to Mrs. Annelise Anderson of Columbia 
University for coordinating the editorial work on this volume. The Operations 
Research Society of America, by including this book in its Publications in 
Operations Research series, generously provided the basis for its publication. 

In conclusion, the papers collected in this book reveal the steady deepening 
of the models capable of representing the economic, physical, and social 
structures of our times. This Fourth Conference marks a notable step forward 
in the international intellectual enterprise called Operational Research. 



WELCOME FROM THE HOST SOCIETY 


John F. Magee 

President , Operations Research Society of America 

Ladies and gentlemen, it is a distinct honor and great personal pleasure to 
welcome you, the delegates to the Fourth International Conference on Opera- 
tional Research sponsored by the International Federation of Operational 
Research Societies, on behalf of the host society, the Operations Research 
Society of America. We are very pleased that you could come and are honored 
by your presence in Cambridge. Those of us who have tasted the hospitality, 
the friendship, and the vigor of discussions at Oxford, Aix-en-Provence, and 
Oslo hope that you will find Cambridge and New England equally attractive, 
interesting, and hospitable. Those of you who are familiar with the Boston area 
will bear me out when I say that this glorious weather we are having is charac- 
teristic, provided by the Committee and the Institute as a suitable backdrop for 
the meeting. 

We have much to do in the next few days; there are many interesting tech- 
nical discussions, consideration of the policies, practices and future activities 
of the Federation, and renewals of old friendships formed over the last decade. 
Therefore, I will be brief; however, I thought you would be interested in some 
of the characteristics of this meeting. 

Mr. Martin Ernst, our General Chairman, tells me that we have 347 
delegates to the meeting from 25 countries representing every part of the world. 
I need not remind you of the magnitude, complexity, and depth of the technical 
program. Those who have attempted to read, or even weigh, the preprints know 
the magnitude of the program, but I think that the breadth and depth of the 
program for this meeting form a testament to the vision of the founders of the 
Federation and to the vigor of the operations research movement throughout the 
world. I think it is fitting, indeed, that this Fourth Conference, which inaugu- 
rates the tenth year of the activities of the Federation, should return to the 
Massachusetts Institute of Technology. I use the word “return” advisedly; 
for M.I.T. has played a remarkable if not unique role among institutions of 
the world in the operations research movement. Her alumni are among the 
most distinguished members of the Operations Research Society of America 
and are active in the societies throughout the world. Her faculty has played a 
part in the development of education for operations research and has encouraged 
and fostered work in the field throughout the world. The Institute itself has 
been a leader in this country in the development of education and research 
in operations research from the beginning. Therefore it is a distinct pleasure 
to all of us, I believe, to be able to accept the very gracious hospitality that the 
Institute has provided this Conference. In view of this tradition and hospi- 
tality, I consider it a distinct honor and pleasure to introduce the President 
of the Massachusetts Institute of Technology, Dr. Howard Johnson. 





WELCOME FROM M.I.T. 


Dr. Howard Johnson 

President , Massachusetts Institute of Technology 

Mr. Chairman, it is a great pleasure for me to extend a very personal welcome 
to all of you to M.I.T., to all of the members of this Fourth International 
Conference on Operational Research and the first one in this country. I hope 
that your meetings here during these few days at M.I.T. are productive both 
in extending the communication of ideas and approaches and in developing 
friendly collaboration in this most significant and still new area of 
work. In some ways it is true that the language of operations research is one of 
the more universal languages today. I don’t know whether this stems from 
the mathematics that form the fundamental basis of the approach or because the 
problems of decision making to which operations research is being applied 
are so commonly pressing throughout the world, but whether it is in Europe 
or North America, Russia or India, Latin America or Japan, I have found 
operations research applications to management, for example, just to use that 
field, second in interest only to concern about the human being and his adjust- 
ment to an industrial world. As many of you know, you will find M.I.T. an 
appropriate and most sympathetic site for this conference. In a sense your 
chairman is right in welcoming you back to M.I.T. in a spiritual sense. It is a 
fact that M.I.T. was one of the first universities to recognize officially the field 
of operations research as an appropriate academic area for research and study. 
In 1951 the first summer course in operations research attracted a lot of atten- 
tion, and even before that there were courses in operations research being 
conducted by Professor Wadsworth in the Mathematics Department, and, of 
course, you know a good many of the great names associated with this general 
field here. Especially, you know the effective work of Professor Morse of the 
Operations Research Center, and he has been joined over the years by a large 
number of our faculty members and by students from several departments. 
Today professors from civil, electrical, and mechanical engineering, from 
mathematics and physics, and from economics and management participate 
both in the research and the teaching associated with operations research. 
The scope of the study and the definition of the purposes of operations research 
have expanded considerably since those early years of course work here. From 
relatively simple problems of inventory control and production scheduling to 
the larger problems of analysis, the larger systems of firms in the economy, 
the techniques of operations research have continued to attract interested 
students of all fields. The theories of gaming, organization, decision making, 
and artificial intelligence have brought an ever- widening complexity and 
interest to the approaches of operations research, and it is possible now to see a 
very effective beginning in adapting operations research approaches to wider 



considerations of systems design in the fields of industrial organization and 
of public service. I find such extension very constructive. But there remains 
a great deal to be done in both the application of existing tools and in the more 
fundamental development of the underlying theoretical constructions in the 
field if the field is to be advanced still further. Too often in my experience has 
the system under study been too complex in its adaptive and dynamic qualities 
to be susceptible to attack by the operations researcher, and his application of 
operations research analysis has shown that the analyst did not understand 
the nature of the problem being studied. I would agree, in short, that there 
is a great deal yet to be done, and I am optimistic, as you are, about the future 
of developing these approaches to increasingly difficult problems. If we can 
continue to recognize that basic research in these areas is as productive as 
research in other fields of science, we shall be able to continue our progress 
in these important fields of whole system solution. 

Once again, I welcome you to M.I.T., and I look forward with you to a most 
effective conference. Thank you very much. 



CONFERENCE OPENING 
ALLOCUTION D’OUVERTURE 

M. Boiteux* 

Secretary of IFORS 

In opening the Fourth International Conference on Operational Research, 
may I take this occasion first to thank the President of the Massachusetts 
Institute of Technology for the hospitality that the Institute has offered us. 
Mr. President, I thank you on behalf of all the delegates of our member 
societies who are here today, on behalf of their societies, and on behalf of all 
our guests. For many delegates who are in this room, including some of our 
American friends, it is the first time we have entered this prestigious institute 
whose reputation is world-wide. More than one hundred years old, M.I.T. is 
still young and dynamic because it has gathered a first-class corps of professors, 
because it has always kept abreast of the progress of knowledge, and because 
it has been among the first to develop new technology such as operations 
research, which is, of course, of particular interest to us. This new technology 
allows the students of your Institute to have access to the most modern tech- 
niques. It is therefore an honor for our Federation to be welcomed this sum- 
mer within your walls. We are very grateful to you, Mr. President, for having 
accepted us and for having welcomed us in person. In my thanks I could not 
forget Professor Morse, former secretary of our Federation, under whose sponsor- 
ship the present conference is being held. It is he who had the idea to gather 
us together this year at M.I.T., and we owe him the homage due to him, in 
addition to all the reasons we have for showing him how grateful we are for the 
wisdom and efficiency he has shown during the three-year period he spent in 
the interests of the Federation. The Operations Research Society of America 
(ORSA), acting as host society, has undertaken the organization of this con- 
ference, and I thank its President, who so keenly welcomed us at the begin- 
ning of this session, and all those under him who have worked to make a success 
of this enterprise, particularly Martin Ernst, who has presided with hospitality 
and efficiency over the Program Committee, and William Marcuse, who has 
presided over the Local Committee. I should now like to welcome the new 
societies that are going to become members of our Federation: the Hellenic 
Operations Research Society, The Operations Research Society of Ireland, 
and The Mexican Society of Operational Research and Management Science, 
all organizations whose representatives are with us today. In other nations, 
also, operations research societies and groups are organizing to promote opera- 
tions research in their countries. As your secretary, I have invited representa- 
tives from those countries to participate in our conference. This will allow 
them to get to know us better and allow us to get to know them better, and we 
shall be able to see what help the Federation can give them. I am happy to 
welcome, in the names of all of you, all those who have been able to accept 
this invitation to take part in our debates. Finally, among our guests I should 
like to salute the representatives of the great international organizations, the 
United Nations, UNESCO, the Organisation for Economic Cooperation and 
Development (OECD), and NATO, for their interest in our efforts and the help 
they have given us in promoting operations research. Having arrived at this 
point of my speech, I am convinced that it will be easier for most of you to 
understand my French; let us try. 

* Président du Comité Consultatif de la Recherche Scientifique et Technique (France). 
Directeur d’Électricité de France. 


Allow me first of all, Mr. President, at the moment of opening the Fourth 
International Conference on Operational Research, to thank you for the 
hospitality that M.I.T. is offering us. I address these thanks to you on behalf 
of all the delegates of our member societies who are present today, on behalf 
of those societies themselves, and on behalf of our guests. 

For many of the delegates in this room, including some of our American 
friends, it is the first time that we have entered this prestigious Institute. Now 
more than a century old, M.I.T. has kept its youth and dynamism by assembling 
a faculty of the first rank, by striving to remain at the forefront of the progress 
of knowledge, and by being among the first to develop new fields of teaching 
(such as that of Operational Research, in which we are, of course, particularly 
interested) which give the students of your Institute rapid access to the most 
modern techniques. 

It is therefore an honor for our Federation to be welcomed within your walls 
this summer; we are very grateful to you for having agreed to receive us, and 
for having been so kind, Mr. President, as to come and welcome us yourself. 

In these thanks I could not forget Professor Ph. Morse, the former Secretary 
of our Federation, under whose patronage the present conference is placed. 
It was he who had the idea of bringing us together this year at M.I.T.; we owe 
him the homage due to the inventor, in addition to all the reasons we have for 
showing him our gratitude for the wisdom and efficiency he displayed, during 
the preceding three-year term, at the head of our Federation. 

It is ORSA which, as host society, has taken charge of the organization of 
this conference. I thank its President, who welcomed us so kindly at the 
beginning of this first session, and all those around him who have devoted 
themselves to the success of this enterprise: in particular, Martin Ernst, who 
presided with authority and efficiency over the Program Committee, and 
William Marcuse, who presided over the Local Committee. 

Turning now to the audience, I should like to extend a particular welcome 
to the new societies that are about to become members of our Federation: 

- the Greek O.R. society (HELORS), 

- the Irish society (ORSI), 

- and the Mexican society, 

whose representatives are in the room today. 

In other nations as well, O.R. societies are developing or being planned, and 
groups are organizing that wish to promote O.R. in their countries. On behalf 
of our Federation, I have invited prominent figures from those countries to 
take part in our Conference; this will allow them to get to know us better, it 
will also allow us to get to know them better, and together we shall be able to 
see what help we can bring them. I am happy to welcome, in the name of you 
all, those who have been able to accept this invitation to take part in our 
discussions. 

Among our guests, finally, I should like to salute the representatives of the 
great international organizations, the UN, UNESCO, the OECD, and NATO, 
who do us the honor of taking an interest in our efforts and who give us most 
useful help in promoting O.R. throughout the world. 

Today we are celebrating an anniversary: it was twenty-five years ago, in an 
air force group in England, that the expression “Operational Research” was 
born. No doubt the thing itself already existed, in various and more or less 
conscious forms, but the name did not yet exist; it is the twenty-fifth anniversary 
of our christening that this conference marks. 

In the course of these 25 years many efforts have been made, everywhere, to 
promote O.R., and those efforts have been largely crowned with success. 

As for our Federation, it will be 10 years old next year. From conference to 
conference, every three years, it grows. At Aix-en-Provence, in 1960, we 
brought together ten societies covering about 5500 persons. At Oslo, in 1963, 
17 member societies represented 8700 members. At Boston, today, once the 
admission of the new societies we are about to welcome is complete, our 20 
societies will cover nearly 12,000 persons. 

These societies are active: they organize meetings, they collaborate in the 
publication of the “International Abstracts of Operational Research,” and 
many of them publish a journal or a review, intended first of all for their 
members, but which, thanks to our Federation, can more easily acquire an 
international audience. 

Thus the reach of our Federation is extending, at the same time as it deepens 
through the work carried out by the members of our societies in the most 
diverse branches of activity. 

But much remains for us to do, for it is very often the countries where no 
O.R. societies exist, nor perhaps even any operational researchers, that have 
the greatest need of the help that the experts of the more advanced countries 
can bring them. 

Our previous secretary, Ph. Morse, reminded us at Oslo, three years ago, of 
the full importance of this aspect of our mission for our Federation, and I 
cannot overemphasize the activity that he has himself displayed, and continues 
to display, in this direction. 

But I should also like to insist on our duty to take an active part in solving 
the great problems that arise in our countries, by helping to work out more 
rational methods of decision. I am thinking especially of two great topics of 
current concern: 

- scientific and technical research, on which our material (and cultural) 
advancement depends; 

- the management of the urban agglomerations in which the great majority 
of our contemporaries live, and will increasingly live, but where life is 
becoming more and more difficult. 

Research first of all. The expenditures devoted to it each year in the most 
developed countries reach 2 to 3 percent of the national product. 

But we must admit that the criteria that lead to fixing the size of this effort, 
and above all those that determine its distribution among the various disciplines, 
branches of activity, or objectives, are still very poor. 

And the very methods of managing research activities remain largely a 
matter of empiricism, supported by common sense, certainly, but still very 
little by systematic reflection. 

Is it not striking to think that so large a share of national activity escapes so 
widely from rationalized decision processes? 

The ground is certainly not virgin, and notable work has already been done 
in this domain of “research on research.” But operational researchers certainly 
have a role to play here, as is shown, moreover, by the several papers on this 
theme scheduled for the coming days. I can tell you in any case, from personal 
experience, how strongly those responsible for national policy on scientific and 
technical research feel the need to base the choices they must make, one way 
or another, on grounds at once more objective and more effective. 

If everyone agrees today in seeing Research as the most powerful engine of 
growth and of success in international competition, another conviction is also 
spreading, through hard daily experience: that the urban agglomerations in 
which a growing share of humanity is called upon to live are no longer livable. 

For some time now researchers of all disciplines have been attacking this 
problem. Progress has been made in the study and control of traffic and 
transportation; air pollution is on the agenda, and water pollution is attracting 
growing attention. Efforts are being made to design new towns, or new suburbs, 
better adapted to the conditions of our times than those we have inherited. 
But the disease continues to outrun the remedy for a growing proportion of 
city dwellers, and we are threatened, amid increased material affluence, with 
being no longer able to preserve those essential immaterial goods, our time 
and our health. 

Sociology, psychology, econometrics, medicine, physics, chemistry, and 
aesthetics: most disciplines are concerned, while the imagination of technicians 
is called upon in the most diverse directions. Here too, the methods of 
operational research must contribute powerfully to a better solution of these 
problems; indeed, notable contributions are already owed to them, particularly 
in this country, and you will certainly have noticed that an entire session of our 
conference program is devoted to this theme. 

In speaking to you more especially, in this inaugural address, of Research 
and of Cities, I certainly did not mean to minimize the interest of the effort 
that most of you are devoting to the industrial, agricultural, commercial, and 
military domains; there too, we have much to do. And I shall remind you, 
finally, that the developing countries also have great need of us. 

Faced with all these tasks, it is important that we refine our methods, that 
we exchange the experience each of us has acquired in these various domains, 
and that we advance our techniques. That is why we are gathered this week 
in Boston, besides the pleasure of meeting again among friends. I am sure that 
we shall succeed, and that this intellectual enrichment will accompany, in our 
memories, the recollection of the happy days spent together on the occasion of 
this Fourth Conference of our Federation. 



KEYNOTE ADDRESS 
“WHITHER PROGRAM BUDGETING?” 

Dr. Charles Hitch 

Vice President of the University of California 

Delegates and guests. Since Mr. McNamara became Secretary of Defense 
in January 1961, a number of new management techniques have been intro- 
duced in the Department of Defense. Some of these are closely associated with 
the planning and budgeting activities of the Department and have become 
known as “PPB” — the Planning, Programming, Budgeting system. Eleven 
months ago, in September 1965, President Johnson announced that he wanted 
all departments and agencies of the Federal Government to introduce PPB 
systems similar to that of the Department of Defense. In October 1965, just 
ten months ago, the Bureau of the Budget issued to department and agency 
heads its famous Bulletin 66-3 of instructions for establishing such a system in 
each department. Because I think that these developments arc of great signifi- 
cance for the operations research profession, I should like to do two things this 
morning in the hour at my disposal: first, sketch the development of the PPB 
system in the Department of Defense and explain the rationale of the manage- 
ment techniques it introduced and, second, outline some of the problems and 
risks ns well as the opportunities in attempting to extend the use of these tech- 
niques rapidly 4 and universally throughout the Government. 

What are the management techniques that constitute this PPB? There are two, which are related and mutually supporting but distinct — in fact so distinct that it is possible to use either without the other. One of these techniques is called “program budgeting” or, more simply, “programming.” Since program budgeting is sometimes used more broadly to mean the whole PPB system, I shall use the simpler term programming to describe this part of the system. Programming as an activity produces a program or program budget with the following characteristics: first, it is organized or classified by programs rather than, as traditional budgets are, by objects of expenditure, or, if you prefer, it is classified by outputs rather than “inputs.” Second, the resource requirements and the financial implications are linked to these programmed outputs. Third, the program extends far enough into the future to show, to the extent practical and necessary, the full resource requirements and financial implications of the programmed outputs. In the Department of Defense programmed outputs are usually shown for eight years and the financial implications for five years.
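The three characteristics above lend themselves to a small data-structure sketch. This is a hypothetical modern illustration, not anything from the address: the program names echo DoD program categories of the period, but the figures and the two helper functions are invented.

```python
# Hypothetical sketch of a program budget: spending is classified by output
# programs, each linked to its multi-year financial implications, rather than
# by objects of expenditure such as "personnel" or "procurement".
# All dollar figures (in $ billions) are invented for the example.
program_budget = {
    "strategic retaliatory forces": {1967: 7.2, 1968: 6.9, 1969: 6.5},
    "continental defense":          {1967: 1.8, 1968: 1.7, 1969: 1.6},
    "airlift and sealift":          {1967: 1.4, 1968: 1.5, 1969: 1.5},
}

def total_for_year(budget, year):
    """Financial implication of all programmed outputs in a given year."""
    return round(sum(costs[year] for costs in budget.values()), 2)

def cost_to_complete(budget, program):
    """Full multi-year cost linked to one programmed output."""
    return round(sum(budget[program].values()), 2)
```

The point of the structure is the shift Hitch describes: the first-year slice (`total_for_year`) is what a traditional budget shows, while the per-program multi-year total (`cost_to_complete`) is what the program format adds.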

The second of the two management techniques in PPB is variously named 
“systems analysis,” “cost effectiveness analysis,” or “cost benefit analysis,” 
as well as by various other names, including operations research. Our profes- 
sion seems to be singularly plagued by terminological confusion. Let me call 




it systems analysis this morning, since that is its official name in the Department 
of Defense. Systems analysis in this sense is analysis, explicit quantitative 
analysis to the extent practical, that is designed to maximize or at least increase 
the value of the objectives achieved by an organization minus the value of the 
resources it uses. 

These two techniques, programming and systems analysis, were introduced into DOD by Secretary McNamara for one purpose, namely, to improve high-level planning in the Department, that is, planning at the level of Department of Defense headquarters, service headquarters, and the headquarters of the great unified and specified commands. Other management functions in the DOD, such as control and operations, were not affected except very indirectly by these particular McNamara innovations. Even the format of the annual operating budget as appropriated by Congress and accounted for by the Department’s accounting staffs was unaffected. There is an element of irony in DOD getting credit for an innovation that is frequently called “program budgeting” when in fact we didn’t make any significant change in what we call the budget in the DOD. Instead, and this I think proved to be satisfactory enough, we developed a torque converter for converting the five-year program into the budget format and vice versa.

I emphasize the exclusive relation of these techniques to the planning 
function for clarity in explaining their rationale, certainly not to disparage 
them, for I consider planning in its various aspects to be the important function 
of top management in any large organization, whether government, business, 
or education. Before saying more about the techniques let me make some 
general remarks about the nature of planning. The planning function can 
be analyzed in a number of different ways — first, of course, by how distant the 
future time period is with which it is concerned. We have short-range planning, 
that is, planning for the use of existing facilities and resources. We have 
intermediate range planning — the planning of procurement and construction 
of new facilities — and we have long-range planning — the planning of new 
developments with very long lead times — like new major weapons systems 
in defense or like new campuses for the University of California. In defense 
we generally found a ten-year planning cycle long enough for most of our 
developments. At the University of California the lead times are longer. New 
campuses require that we look 35 years ahead to the year 2000. 

Another distinction which is critical to much of my discussion this morning 
is between substantive planning and fiscal planning. Fiscal planning is the 
planning of future budgets — how much money and how to spend it. Substantive planning is the planning of objectives, both ultimate and intermediate
on the way to the ultimate. In the Department of Defense substantive planning 
is called military planning; at the University it is called academic planning. 
Both fiscal and substantive planning can be short, intermediate, or long-range. 

Basically, the reason we introduced the two techniques of programming 
and systems analysis into the Department of Defense in 1961 was to improve 
the exercise of the planning function, which we found in disarray. We intro- 
duced programming to make the military planning of the Department realistic, 





to make it face up to the hard choices by linking it to fiscal planning from which 
it had been entirely divorced, and we introduced systems analysis to provide a
criterion or standard for making those hard choices, to achieve some rationality 
and optimality in the planning. 

When I say that planning was in disarray at the beginning of 1961 I mean 
just that. There was plenty of planning activity of all sorts; short-range, 
intermediate-range, long-range, substantive, and fiscal. The key to the disarray 
was the complete separation, almost complete separation, between substantive 
or military planning and fiscal planning. These two types of planning were 
performed by two different groups — the military planning by the Joint Chiefs 
of Staff and the Joint Staff and the military planners in the services and fiscal 
planning by the civilian secretary and the comptroller organization throughout 
the Department. Second, these two types of planning were couched in different 
terms, not readily translatable and in general not translated. Military planning
was in terms of army divisions, navy ships, fighter-aircraft squadrons, and so 
forth — military units, weapons systems, the outputs of the department. Fiscal 
planning was in terms of budget categories, which were military personnel, 
operations and maintenance, procurement, research, development, test and 
evaluation, military construction — input categories. In practice, the long- 
range and intermediate-range military plans of the Joint Chiefs of Staff and 
the Services were either not costed out in terms of their budget requirements 
or this was done so roughly and unreliably as to be unusable. Third, the two 
types of planning were for different time periods. There were intermediate- 
range and long-range military plans but no fiscal plans extending beyond the 
next budget year. 

In consequence, the intermediate-range and long-range military planning 
was largely ineffective. The Department of Defense had no approved plans 
extending more than one year into the future. Each year the Joint Chiefs of 
Staff would produce its massive intermediate-range plan called the “Joint 
Strategic Objectives Plan,” the JSOP, extending five to ten years into the 
future, and would send it to the Secretary of Defense, who would note it and 
file it. Before McNamara no JSOP was ever approved. Then in the budget
season, in October and November, the real-life decisions were made by civilian 
secretaries advised in the main by the comptroller organization. Why was 
the JSOP ignored? Primarily because it was financially infeasible. It was 
essentially a pasting together of the wish lists of the four military services. If 
costed out, the budgets it required would be far in excess of what any Secretary 
of Defense or President or Congress would approve. The system, in short, 
did not require the military planners to face up to the hard choices that are 
part of responsible management. Let me emphasize that this was not the 
fault of the military planners but of the system. In organizations with similar 
systems academic and business planners act just like military planners. 

But, since the military planners didn’t make the hard choices, the civilian 
secretary had to as best he could in his budget review and without much help 
from intermediate-range or long-range military plans. The method he used in 
his budget review, lacking any other, might be described generically as the





of the budget category in which the funds are appropriated. The total dollars 
required for the program each year are within limits that the Secretary of 
Defense considers appropriate and feasible. The program shifts the emphasis 
from costs in next year's budget to costs to complete and operate a weapons 
system or program. 

The program, once established in 1961, is continuously in being. There is 
always a program, an approved program, but a program-change procedure 
results in several billion dollars worth of changes in the program each year. 
Any office of the Department of Defense may propose a change in the program. 
All major changes have to be approved by the Secretary of Defense after review 
and recommendations by the Joint Chiefs of Staff. So we end up with a plan- 
ning, programming, budgeting system in which the program links the military 
plans on the one side and the budget on the other. 

The function of the planning in the planning-programming-budgeting 
system is to develop alternatives, better alternatives, to those that are embodied 
in the currently approved program. The planning is carried out at all levels 
of the Department and it takes three forms. One of these is the more or less 
traditional military planning such as that embodied in the JSOP, which con- 
tinues. The second is systems analysis, about which I will say a little more later, 
and the third consists of blends of the two. 

The budget, the annual budget, has become in effect the first annual slice of 
the five-year program. The annual budget review continues, but it has become 
an intensive final analysis of the financial requirements of the program for the 
next fiscal year rather than a review of the program itself. 

I am sure that I do not need to say so much to this audience about the second 
of the management techniques in the PPB system, which is called “systems 
analysis,” “cost effectiveness,” or “cost benefit analysis,” or operations 
research. It is nothing more or less than economic analysis applied to the 
public sector. Economic analysis is concerned with the allocation of resources. 
Its basic maxim is: maximize the value of objectives achieved minus the value 
of the resources used. In business this reduces itself to maximizing profits, 
both income and outgo being measured in dollars. In Defense and generally 
in the public sector we lack a common valuation for objectives and resources 
and therefore have to use one of two weaker maxims — maximize objectives for 
given resources or minimize resources for given objectives. This is what a 
systems analysis attempts to do — to choose weapons systems and modes of 
operating them which maximize some military objective or objectives, for 
example, the number of attacking bombers or missiles shot down for given 
resources, or budget dollars, available. The function of the program is to 
cost out the plans to keep them feasible and realistic, to make the planners 
face up to the hard choices. The function of systems analysis is to get dollars 
into the calculations at an earlier stage — into the planning process or into 
the evaluation of alternative ways of achieving a military objective. You can't 
choose the optimal way or even a good way without knowing both about the 
alternatives — what the alternatives achieve, and what they cost. From small
beginnings, which long antedated McNamara, the use of systems analysis





has been rapidly expanded until it has become a vital part of the planning and 
decision-making process in the Department of Defense. Since last September 
systems analysis has become the sole function of an Assistant Secretary of 
Defense. 
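The “weaker maxim” that Hitch states (maximize the objective achieved for given resources) can be made concrete with a deliberately toy allocation sketch. Every detail here is invented for the example: the candidate systems, their costs, and their effectiveness numbers. The exhaustive search simply picks the mix of systems that maximizes bombers shot down within a fixed budget.

```python
from itertools import combinations

# Invented candidate systems: name -> (cost in $ billions, bombers shot down).
systems = {
    "interceptor A": (3, 40),
    "interceptor B": (5, 70),
    "missile C":     (4, 55),
    "radar net D":   (2, 20),
}

def best_mix(budget):
    """Exhaustively choose the subset of systems that maximizes the
    objective achieved without exceeding the given resources."""
    best, best_value = (), 0
    names = list(systems)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            cost = sum(systems[s][0] for s in subset)
            value = sum(systems[s][1] for s in subset)
            if cost <= budget and value > best_value:
                best, best_value = subset, value
    return best, best_value

mix, shot_down = best_mix(budget=9)
```

With a budget of 9, the search selects interceptor B and missile C (cost 9, 125 bombers). The dual maxim (minimize resources for a given objective) would fix the effectiveness target instead and minimize the cost sum.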

So, the program provides the link between planning and budgeting, relating 
forces and their costs to national security objectives, whereas systems analysis 
provides the quantitative analytical foundation in many, but by no means all, 
areas for making sound choices among the alternative means of achieving the 
objectives. Between them they give the Secretary of Defense the tools that are 
necessary for planning a program with balance and some rationality and there- 
fore for the unified management of his $60-billion-a-year department. For the 
first time the Secretary of Defense is able to exercise the authority given him in 
the National Security Act of 1947, as amended, which attempted to unify the 
Military Services. 

I have spent so much time explaining what happened in the Department 
of Defense that I am going to take much less time to answer the question 
“whither.” This is perhaps as well, for I know more about the past than 
about the future. Let me speculate with some shorthand points. First, all 
large organizations, whether government, business, or mixed, have many 
problems in common. I am very impressed with their similarities, having 
recently moved from one large organization to another which sounds very 
different but which has many of the same problems. Among them is the prob- 
lem of achieving realistic, balanced, rational plans. So I am sure that similar 
techniques have widespread application in other organizations. 

Second, in fact they already have widespread application. The Department 
of Defense is not the first organization to develop a fiscal plan or program that 
extends more than a year into the future and that has evolved budget categories 
more suitable for planning — for intermediate- and long-range fiscal plan- 
ning — than objects of expenditure. Other organizations have confronted and 
more or less satisfactorily solved the problems of unrealistic and too decentral- 
ized planning. Similarly, many well-managed businesses have made explicit 
quantitative economic analyses of, for example, alternative equipment and 
facility plans, which are indistinguishable from what is called systems analysis 
in the Department of Defense, and operations researchers like yourselves have 
assisted military, other governmental, and business planners with varying 
degrees of success for the last 25 years. What is different perhaps in the Depart- 
ment of Defense is that their systems analysis has become a generally accepted 
way of life, perhaps for the first time in any large public organization. 

Third, there are risks and dangers as well as opportunities in trying to 
move too far too fast in the application of new management techniques like 
these, not least the risk of discrediting the techniques. Although it did not 
appear easy at the time, there is no doubt in my mind that the Department of 
Defense (or much of it) is easier to program and to analyze quantitatively than 
many areas of civilian government; for example, it is easier than the foreign 
affairs area in which I have perhaps foolhardily been attempting to advise 
the State Department on how to install a planning-programming-budgeting 





system. And quite apart from ease or difficulty the substantive problems in other areas are different and new. In Defense we had several hundred analysts at the RAND Corporation alone, many others elsewhere, developing programming and systems-analysis techniques for a decade before Defense attempted any large-scale general application. No remotely similar preparatory effort has gone into any other governmental area, and the number of trained and skilled people is so limited that they are inevitably spread far thinner in other departments of government than they were and are in Defense.

Fourth and finally, if I may end on an encouraging note, although these techniques are mutually supporting, we are not dealing here with a matter of either/or. There is an infinity of degrees. Not only may one introduce a program budget without systems analysis, or vice versa, but each may be used in limited areas or ways and sometimes quite productively; for example, in foreign affairs, in which quantification of objectives and therefore full systems analysis is so difficult, one can, I think, organize the budget more meaningfully for planning purposes. In many areas a systems cost analysis is possible and useful, although a full systems analysis, involving measurement of objectives, is not as yet. In many areas in which the more or less grand optimizing of systems analysis is not yet possible all sorts of low-level “suboptimizing” is, and although suboptimizing has its dangers it is the way to begin. It can give valuable insights into improving both plans and operations and it is the vehicle for learning and for training the large numbers of analysts required for widespread application of these techniques. So my advice, which is free and freely given, is exercise patience and don’t expect too much too soon in other applications. Given time, these are techniques of great promise for improving the planning and the performance of many large organizations.



CONTENTS
AND PROGRAM 


TABLE DES MATIERES 
ET PROGRAMME 


Session I 

ADVANCES IN TECHNIQUES OF MATHEMATICAL 
PROGRAMMING 

PROGRES DANS LES TECHNIQUES DE PROGRAMMATION 
MATHEMATIQUE 

Session Chairman (Président) J. Abadie

Experiments and Statistical Data on the Solving of Large-scale Linear Programs / Expériences et Statistiques sur la Résolution des Programmes Linéaires de Grandes Dimensions
Jacques de Buchet  3

The Slacked Unconstrained Minimization Technique for Convex Programming / La Technique de Minimisation avec Variables d’Écart et sans Contrainte pour un Programme Convexe
A. V. Fiacco/G. P. McCormick  13

Applying Benders’ Partitioning Method to a Nonconvex Programming Problem / Application de la Méthode de Partage de Benders à un Problème de Programmation Non-Convexe
C. K. C. Metz/R. A. Howard/J. M. Williamson  22

Some Problems in the Use of Linear Programming in Operation Planning / Quelques Problèmes dans l’Utilisation de la Programmation Linéaire pour la Préparation des Opérations
S. Tamaki  36

Approximate Algorithm for Construction of Optimal Reliable System with Arbitrary Structure / Algorithme Approché pour la Construction d’un Système Optimal Sûr Ayant une Structure Arbitraire
Igor A. Ushakov  44

Session II 

PROGRESS IN TECHNIQUES OF DECISION THEORY 
PROGRES DANS LES TECHNIQUES DE THEORIE DE DECISION 

Session Chairman (Président) H. Raiffa

Decision Analysis: Applied Decision Theory / Analyse des Décisions: Théorie Appliquée des Décisions
Ronald A. Howard  55

The Application of Information Theory to a Model of Sequential Selection of Inspection Methods / Application de la Théorie de l’Information à un Modèle de Choix Séquentiel d’Inspections
Henri Marty  72

On the Application of Fiducial Probability to Statistical Decisions / Sur l’Application des Probabilités Fiduciales aux Décisions Statistiques
G. Menges/H. Diehl  82

Model Inference in Adaptive Pattern Recognition / Déduction à Partir d’un Modèle dans la Reconnaissance Adaptive des Formes
George R. Murray, Jr./Richard D. Smallwood  92

A Theory of the Control of the Cognitive Processes / Une Théorie sur le Contrôle des Moyens de Perception
Nicholas M. Smith  111

Optimal Dividend Policy / Politique de la Distribution Optimale du Dividende
Robert Wilson  128


Session III


ADVANCES IN TECHNIQUES OF MODELING 
PROGRES DANS LES TECHNIQUES DE MODELE 

Session Chairman (Président) G. E. Kimball


Some Theoretical and Practical Aspects of the Adaptive Control Process / Quelques Aspects Théoriques et Pratiques des Processus de Comportement Adaptatif
J. Torrens-Ibern  141

Zero-Zero Chance-Constrained Games / Jeux Zéro-Zéro à Contraintes par le Hasard
A. Charnes/M. J. L. Kirby/W. M. Raike  150

Solution of Some Surveillance-Evasion Problems by Methods of Differential Games / Solution de Quelques Problèmes Surveillance-Évasion par les Méthodes de Jeux Différentiels
J. M. Dobbie  170

ε-Optimal Strategies in Finite Stochastic Dynamic Programming / Stratégies ε-Optimales dans les Programmes Dynamiques Stochastiques Finis
Arnold Kaufmann/Roger Cruon  185

The Organization of an Industrial System as a Servomechanism / L’Organisation d’un Complexe Industriel Comme un Servo-mécanisme
H. J. Grunwald  201


Session A 

THEORY OF GRAPHS 
THEORIE DES GRAPHES 


Session Chairman (Président) C. Berge

Flowgraphs Applied to Continuous Generating Functions / Graphes des Flots Appliqués aux Fonctions Génératrices Continues
C. S. Lorens  205

Stochastic Networks and Research Planning / Réseaux Stochastiques et Planification de la Recherche
Burton V. Dean  215

Algebraic Determination of Loops and Paths in Graphs / Détermination Algébrique des Boucles et Chemins des Graphes
S. Okada  235


Session B 


MARKETING 

PROBLEMES COMMERCIAUX
Session Chairman (Président) J. D. C. Little

Development, Validation, and Implementation of Computerized Microanalytic Simulations of Market Behavior / Développement, Validation, et Mise en Exécution de Simulations Microanalytiques du Comportement d’un Marché à l’Aide d’un Ordinateur
Arnold E. Amstutz  241

Computer-Aided Preparation of Magazine Advertisement Formats / Préparation des Formes de Publicité dans les Magazines à l’Aide d’un Ordinateur
Daniel S. Diamond  263

A Heuristic Approach to Some Sales Territory Problems / Une Approche Heuristique de Quelques Problèmes Relatifs aux Territoires de Vente
James B. Cloonan  284

Issuing and Pricing Policies for Semiperishables / Politiques pour la Distribution et la Détermination du Prix des Denrées Semi-périssables
S. Eilon/R. V. Mallya  293

A Geographic Model of an Urban Automobile Market / Un Modèle Géographique d’un Marché Automobile Urbain
Theodore E. Hlavac, Jr./John D. C. Little  302

On Some Marketing Models / Sur Quelques Modèles de Marketing
Shiv K. Gupta  312


Session C 


TRANSPORTATION 

TRANSPORT 

Session Chairman (Président) E. C. Williams

Operations Research in North American Railroading / La Recherche Opérationnelle dans les Compagnies Ferroviaires Nord-Américaines
Peter B. Wilson  327

A Computer Simulation Model of Railroad Freight Transportation Systems / Une Simulation sur Computer des Réseaux Routiers de Transport de Marchandises
William P. Allman  339

Network Retrieval of Freight Rates / Un Recouvrement des Tarifs Marchandises par un Réseau
William D. Cassidy  352

Operations Research Activities on the Japanese National Railways / Activités de Recherche Opérationnelle aux Chemins de Fer Nationaux Japonais
Shun-Ichi Urabe  359

Safety-at-Sea Problems / La Sécurité dans des Problèmes Maritimes
Arne Jensen  362

A Formal Description of Steamship Cargo Operations / Une Description Formelle des Opérations d’un Paquebot Cargo
R. J. Parente/D. F. Boyd  370

Operational Research in Urban Public Passenger Transport in the United Kingdom / La Recherche Opérationnelle dans les Transports Publics Urbains dans le Royaume-Uni
P. I. Welding  384

The Design of Routes and Service Frequencies for a Municipal Bus Company / La Détermination des Itinéraires et de la Fréquence des Services d’une Compagnie Municipale d’Autobus
A. H. Lines/W. Lampkin/P. D. Saalmans  395

Shortest Distances Through Partitioned Graphs with an Application to Urban Transportation Planning / Plus Courtes Distances par la Méthode des Graphes Cloisonnés avec une Application à la Préparation d’un Système de Transports Urbains
Charles W. Blumentritt  405

Traffic Assignment by a Stochastic Model / L’Affectation du Trafic par un Modèle Stochastique
Hasso von Falkenhausen  415

Optimum Design of Highways / Optimalisation des Projets Routiers
J. L. Groboillot/J. L. Deligny  421

Report on the Activities of AGIFORS, 1963-1966 / Rapport sur les Activités de l’AGIFORS, 1963-1966
J. Taylor  440

Demand Forecasting for Airline Scheduling / Prévision de la Demande pour l’Établissement des Horaires d’une Compagnie Aérienne
G. S. Shaw/J. W. Abrams  444


Session D 

URBAN PLANNING 
PLAN D'URBANISME

Session Chairman (Président) R. L. Ackoff

Strategies for Operations Research in Urban Metropolitan Planning / Stratégies pour la Recherche Opérationnelle dans la Planification Urbaine et Métropolitaine
Britton Harris/Russell L. Ackoff  471

A Decision-Oriented Model of Urban Growth / Un Modèle Orienté vers la Prise de Décision pour la Croissance Urbaine
David R. Seidman  480

Experiments with a Matrix Model of Population Growth and Distribution / Expériences avec un Modèle Matriciel de Croissance et Distribution de Population
Andrei Rogers  490


Session E 

INVESTMENT POLICY ANALYSES 
ANALYSES D’INVESTISSEMENTS 

Session Chairman (Président) J. Lesourne

The Method of Extended Models: Definition: Application of the Method to Investment Selection / La Méthode des Modèles Élargis: Définition: Application à un Modèle de Choix des Investissements
F. Bessière  503

Investment Problems / Problèmes d’Investissement
J. R. Barra/A. Segond  513

Method of Selecting Investments and Making Allocations / Méthode de Choix des Affectations et des Investissements
L. Thiriet/J. Gaussens  527

Policy of Sequential Adaptation of Production Techniques to Electric Power Consumption / Politique d’Adaptation Séquentielle des Moyens de Production à la Consommation d’Énergie Électrique
M. Albouy  538

A Dynamic Approach to Capital Investment — A Case Study / Une Approche Dynamique au Problème de l’Investissement du Capital — Étude d’un Cas
J. H. Culhane/A. M. Ronaldson/R. M. Zimmermann  551


Session F-I 


SCHEDULING PROBLEMS — I 
PROBLEMES DE PRODUCTION ET DE GESTION — I 


Session Chairman (Président) E. Nievergelt

Optimization Algorithms for a General Class of Scheduling Problems under Resource Limitations / Algorithmes d’Optimisation pour une Classe Générale de Problèmes d’Ordonnancement avec Limitations des Ressources
R. Descamps/P. Chevignon  567

Scheduling Problems with Cumulative Constraints / Problèmes d’Ordonnancement avec Contraintes Cumulatives
Michel Conan  583





Session F-II

SCHEDULING PROBLEMS — II
PROBLEMES DE PRODUCTION ET DE GESTION — II

Session Chairman (Président) C. A. Zender

Sequencing Against Due-Dates / Ordonnancement pour les Travaux à Terme
R. W. Conway/W. L. Maxwell/J. W. Oldziey  599

Scheduling for Missile Ranges / Ordonnancement pour les Bases de Fusées
John R. Norton/Kenneth W. Webb  615

The Resolution of Conditional Scheduling Conflicts / La Résolution des Conflits d’Ordonnancement Conditionnels
James F. Rial  633

Heuristics for Resolution of Logical Scheduling Conflicts / Heuristiques pour la Résolution des Conflits Logiques d’Ordonnancement
Leo C. Driscoll/Lee Suyemoto  651

The Assignment Interference Model / Le Problème d’Affectation avec Interférence
Raoul J. Freeman  680


Session H-I

SIMULATION — I
SIMULATION — I

Session Chairman (Président) K. D. Tocher

The State of the Art of Simulation — A Survey / La Situation de l’Art de la Simulation — Vue Générale
K. D. Tocher  693

Statistical Approach for Validating Simulation Models by Comparison with Operational Systems / Une Approche Statistique pour Valider des Modèles de Simulation par Comparaison avec des Systèmes Opérationnels
A. V. Gafarian/J. E. Walsh  702

Compiling Strategies for Some CSL Implementations / La Compilation de Stratégies pour Quelques Applications de CSL
A. T. Clementson/J. Buxton  705


Session H-II

SIMULATION — II
SIMULATION — II

Session Chairman (Président) K. D. Tocher

Simulation of Fleet Rotation / Simulation de Rotation d’une Flotte
J. Allard/P. Jacquet/J. Vautier  723

Simulation of Random Readership Behavior: The Seal Model / Simulation des Comportements Aléatoires de Lecture: Le Modèle Seal
P. Bertier/H. Le Boulanger/B. Roy  732

Comparing Results from a War Game and a Computer Simulation / Comparaison des Résultats d’un Jeu de Guerre et d’une Simulation par Ordinateur
M. D. F. Boulton/S. J. Hopkins/J. B. Fain/W. W. Fain  739


Session I

NATURAL RESOURCES
RESSOURCES NATURELLES

Session Chairman (Président) G. H. Symonds

Linear Dynamic Decomposition Programming of Optimal Long-Range Operation of a Multiple Multipurpose Reservoir System / Programmation de Décomposition Dynamique Linéaire du Fonctionnement Optimal à Long Terme d’un Système de Réservoirs à Objectifs Multiples
Shailendra C. Parikh  763

An International Agricultural Model / Modèle Agricole Interrégional
M. Desport/M. Vercueil  779

Optimal Development of Underground Petroleum Reservoirs — Part 1 / Développement Optimal des Réservoirs Souterrains de Pétrole — Partie 1
J. S. Aronofsky/R. C. Reinitz/H. Dogrusoz  800

Operations Research in the Coal Industry in India / La Recherche Opérationnelle dans l’Industrie du Charbon en Inde
A. Ghosal  814


Session J 

DISTRIBUTION SYSTEMS 
SYSTEMES DE DISTRIBUTION 

Session Chairman (President) S. Eilon 


A Study of a Sequential Investment
Policy for the Restructuring of a Plant
Network

Georges Comes / Jean-Louis Bellon
An Optimization Algorithm for a Class
of Distribution Problems
René Descamps

Control of Excess Stock in a Multiware-
house System
A. D. Davies / S. B. Heller
An Integrated System for Inventory
Management in a Retail Chain Store

Jacques Mélèse / Pierre Elina


Étude d'une Politique Séquentielle
d'Investissements pour la Restructura-
tion d'un Réseau d'Usines
Georges Comes / Jean-Louis Bellon 825

Un Algorithme d'Optimisation pour
une Classe de Problèmes d'Implantation
René Descamps 834

Contrôle de l'Excédent de Stocks dans
un Système d'Entrepôts Multiples
A. D. Davies / S. B. Heller 842

La Gestion par un Système Intégré des
Approvisionnements d'une Chaîne de
Magasins

Jacques Mélèse / Pierre Elina 851





CONTENTS AND PROGRAM 


A Two-Level Inventory Control Rule Une Regie de Gestion des Stocks a Deux 
in a Distribution System Niveaux dans un Système de Distribu-

tion 

J. A. Cran J. A. Cran 864 

Session K-I 

GENERAL CONTRIBUTED PAPERS — I (STOCHASTIC MODELS) 
RAPPORTS GENERAUX OFFERTS— I (MODELES STOCHASTIQUES) 


Session Chairman (President) E. Koenigsberg

Use of the Weibull Distribution in Utilisation de la Distribution de Weibull
Bayesian Decision Theory dans la Théorie Bayésienne des Déci-

sions

R. M. Soland R. M. Soland 881

Totally Positive Stochastic Wear Pro- Processus Stochastiques d'Usure Totale-
cesses ment Positifs

R. C. Morey R. C. Morey 892

The Uncertainty of Uncertainty L'Incertitude de l'Incertitude

P. C. Fishburn / A. H. Murphy P. C. Fishburn / A. H. Murphy 906

A Class of Two-Dimensional Queues Une Classe de Files d'Attente à Deux

Dimensions

Richard V. Evans Richard V. Evans 914

Queuing with Strict and Lag Priority Files d'Attente de Priorité Stricte et
Mixtures Non-Stricte

Leonard Kleinrock Leonard Kleinrock 921


Session K-I I 

GENERAL CONTRIBUTED PAPERS — II (BUSINESS APPLICATIONS) 
RAPPORTS GENERAUX OFFERTS — II (APPLICATIONS COMMERCIALES)

Session Chairman (President) R. W. Shephard 


Determining Future Equipment Needs

Pierre Gaussens / René Pélissier
A Real-Time Study of Information
Requirements for Project Selection in
Research and Development
Albert H. Rubenstein
Business Games, Programmed Players,
and Individual Decision-Making Pro-
files — Basic Technique

J. M. Braasch

On Analyzing the Economics of Com-
puter Program Development: A Pro-
gress Report and a Prognosis (Sum-
mary)

George F. Weinwurm


Étude de la Gestion des Équipements
Futurs

Pierre Gaussens / René Pélissier 939

Une Étude en Temps Réel des Besoins en
Information pour la Sélection d'un
Projet en Recherche et Développement
Albert H. Rubenstein 947

Jeux d'Entreprises, Joueurs Program-
més et Profils Individuels de Processus
de Prise de Décision — Technique de
Base

J. M. Braasch 959

Sur l'Analyse du Développement des
Programmes de Calculatrices: Compte
Rendu et Pronostic (Sommaire)

George F. Weinwurm 974





Session K-III 

GENERAL CONTRIBUTED PAPERS— III (NEW APPLICATIONS) 
RAPPORTS GENERAUX OFFERTS— III (APPLICATIONS NOUVELLES) 

Session Chairman (President) W. Jewell 


Scientific Method in Rock Excavation
E. G. Losch

A Crude Model of Tuberculosis Epide-
miology

Emilio C. Venezian
A Police Conflict Model
Macon Fry

The Maximum-Minimum Method for
the Solution of a Nonlinear Dynamic
Programming Problem with Discrete
Values of the Unknown Variables
P. Renouard

Decomposition in Traveling-Salesman
Problems

T. C. Hu


Méthode Scientifique dans l'Excavation
Rocheuse

E. G. Losch 983

Un Modèle Simple pour la Propagation
de la Tuberculose

Emilio C. Venezian 991

Un Modèle de Conflit Policier
Macon Fry 1002

Méthode du Gradient Maximum-Mini-
mum pour la Solution d'un Problème
de Programmation Non Linéaire, avec
Valeurs Discrètes des Inconnues
P. Renouard 1015

Décomposition dans les Problèmes du
Voyageur de Commerce
T. C. Hu 1021


Session K-1V 

GENERAL CONTRIBUTED PAPERS — IV
(MATHEMATICAL PROGRAMMING) 
RAPPORTS GENERAUX OFFERTS— IV 
(PROGRAMMATION MATHEMATIQUE) 

Session Chairman (President) J. C. Mathieu


New Results on the Method of Centers
P. Faure / P. Huard

Generalization of the Wolfe Reduced
Gradient Method to the Case of Non-
linear Constraints
J. Carpentier / J. Abadie
A Markovian Procedure for Strictly
Concave Programming with Some Linear
Constraints
Arthur M. Geoffrion
An Application of Linear Programming
to Mergers

R. R. P. Jackson


Résultats Nouveaux Relatifs à la Méthode
des Centres

P. Faure / P. Huard 1033

Généralisation de la Méthode du
Gradient Réduit de Wolfe au Cas de
Contraintes Non Linéaires
J. Carpentier / J. Abadie 1041

Une Procédure Markovienne pour la
Programmation Strictement Concave
avec des Contraintes Linéaires
Arthur M. Geoffrion 1053

Une Application de la Programmation
Linéaire aux Fusions

R. R. P. Jackson 1059





INFORMAL MEETINGS, CONFERENCE SUMMATION, AND CLOSING 
ASSEMBLEES OFFICIEUSES,

SOMMAIRE DE CONFERENCE ET CONCLUSION 


Summary of the Session on Compara-
tive Operations Research

Melvin F. Shakun

Sommaire de l'Assemblée de la Recher-
che Opérationnelle Comparative

Melvin F. Shakun 1069


Summary of the Session on Military
Applications

J. A. Bruner

Sommaire de l'Assemblée des Appli-
cations Militaires

J. A. Bruner 1073


Summary of the Session on PERT/
CPM

H. V. Murphy

Sommaire de l'Assemblée de PERT/
CPM

H. V. Murphy 1074


Conference Summation

Philip M. Morse

Sommaire de Conférence

Philip M. Morse 1075


Conference Closing

M. Boiteux

Allocution Finale

M. Boiteux 1081

INDEX 1085



PROCEEDINGS OF THE FOURTH 
INTERNATIONAL CONFERENCE 
ON OPERATIONAL RESEARCH 


ACTES DE LA QUATRIEME 
CONFERENCE INTERNATIONALE 
DE RECHERCHE OPERATIONNELLE 



SESSION I 


ADVANCES IN TECHNIQUES 
OF MATHEMATICAL PROGRAMMING 


Progrès dans les Techniques de Programmation Mathématique


Chairman: J. Abadie (France) 




EXPERIENCES ET STATISTIQUES SUR LA RESOLUTION
DES PROGRAMMES LINEAIRES DE GRANDES DIMENSIONS

Experiments and Statistical Data 
on the Solving of Large-scale Linear Programs 

Jacques de Buchet

Chief Engineer, Société d'Informatique Appliquée,

Paris, France


1. INTRODUCTION

Since November 1964 we have used the OPHELIE linear programming code† to solve several hundred linear programs of all sizes, some with a large number of constraints, ranging from 1000 to more than 2500. Throughout these computations we accumulated practical evidence that the simplex method, in the product form of the inverse, remains well suited to solving such linear programs directly, without decomposition. We were also able to obtain practical results on several of the algorithms used in a linear programming code. To lighten the presentation, we refer only to averaged points from statistical analyses rather than to a large number of individual problems.


2. THE PROBLEMS

In the absence of large-scale reference problems, we cite seven problems (Table 2.1) that were not constructed specially for these tests: all correspond to programs in routine production. Their sizes range from 1221 to 2617 constraints. This count includes all the potential objective functions, which appear in the matrix as additional rows.

The number of variables does not include the slack variables generated to turn the inequalities into equalities; the number of coefficients is that of the matrix before this transformation. The density of a problem is the ratio of the number of matrix coefficients to the product of the number of constraints by the number of variables. We express this density in parts per thousand (‰).
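The density definition above amounts to a one-line computation; the helper below is an illustration of ours, not part of OPHELIE.

```python
def density_per_mille(nonzeros, constraints, variables):
    """Density as defined in the text: the number of nonzero matrix
    coefficients divided by (constraints x variables), expressed in
    parts per thousand."""
    return 1000.0 * nonzeros / (constraints * variables)
```

For problem A of Table 2.1 (17467 nonzeros, 1221 constraints, 2286 variables) this gives about 6.26‰, consistent with the 6.2‰ reported.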

† This code was developed under the author's direction for the CONTROL DATA 3600 (basic cycle time 1.5 μs, fast-memory capacity 65528 words of 48 bits).






JACQUES DE BUCHET 

Table 2.1
The Problems

Problem name                       A      B      C      D      E      F     G†
Number of constraints           1221   1467   1564   1835   2070   2151   2617
Number of variables             2286   1311   6096   3912   1880   4620   5713
Number of nonzero coefficients 17467   6037  44107  28030  11739  15388  29810
Density (‰)                      6.2    3.1    4.6    3.9    3.0    1.5    1.9

† This problem had a starting basis carried over from earlier computations.


3. THE SCALING ALGORITHM

The problem was to obtain a condensed representation of the elements of the input matrix without altering their precision.

It has often been observed that widely differing orders of magnitude among the elements of a matrix cause numerical difficulties when such matrices are inverted; it is therefore natural to rescale their coefficients by multiplying rows and columns by constants.

This procedure also allows a substantial reduction of the storage needed inside the computer to hold these coefficients; see the Annex.

We chose to scale the coefficients around unity, and therefore had to characterize the deviation of a coefficient $a_{ij}$ from unity by

$$\epsilon_{ij} = a_{ij} + \frac{1}{a_{ij}} \qquad (1)$$

The task is to find row multipliers $l_i$ and column multipliers $k_j$ such that, after multiplication by these scaling factors, the new coefficients $\bar{a}_{ij} = l_i k_j a_{ij}$ of the matrix achieve

$$\min_{l,k} \sum_i \sum_j \left( l_i k_j a_{ij} + \frac{1}{l_i k_j a_{ij}} \right)^2 \qquad (2)$$

Differentiating (2) with respect to $l_i$ and $k_j$ gives

$$2 \sum_i \sum_j \left( a_{ij}^2\, l_i\, k_j^2 - a_{ij}^{-2}\, l_i^{-3}\, k_j^{-2} \right) dl_i + 2 \sum_j \sum_i \left( a_{ij}^2\, l_i^2\, k_j - a_{ij}^{-2}\, l_i^{-2}\, k_j^{-3} \right) dk_j$$

At the minimum of this expression, two systems of equalities hold:

$$\sum_i \left( a_{ij}\, l_i\, k_j \right)^2 = \sum_i \frac{1}{\left( a_{ij}\, l_i\, k_j \right)^2} \qquad (3)$$

$$\sum_j \left( a_{ij}\, l_i\, k_j \right)^2 = \sum_j \frac{1}{\left( a_{ij}\, l_i\, k_j \right)^2} \qquad (4)$$




RESOLUTION DES PROGRAMMES LINEAIRES DE GRANDES DIMENSIONS




Equation (3) states that in each column $j$ the sum of the squares of the transformed elements equals the sum of the squares of their inverses; (4) states the same thing for each row.

Systems (3) and (4) can be solved by successive iterations: determine column scaling factors $k_j^1$ such that (3) is satisfied in each column, then determine row scaling factors $l_i^1$ such that (4) is satisfied in each row with the previous values $k_j^1$, and iterate in this way until convergence.

This scaling algorithm is extremely simple: at each iteration it suffices to read the matrix, which is stored by columns, to determine first the column and then the row scaling factors.
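The iteration just described can be sketched as follows. This is an illustrative reconstruction of ours (the function name and the dense-matrix representation are assumptions, not OPHELIE's sparse code): for a single factor c applied to nonzeros b, the condition Σ(cb)² = Σ(cb)⁻² gives c = (Σb⁻² / Σb²)^(1/4), which is applied column by column and then row by row.

```python
import numpy as np

def scale_rows_cols(A, iterations=1):
    """One or more passes of the scaling iteration: column factors k_j
    are adjusted so that condition (3) holds in each column, then row
    factors l_i so that condition (4) holds in each row."""
    A = np.asarray(A, float)
    l = np.ones(A.shape[0])            # row multipliers l_i
    k = np.ones(A.shape[1])            # column multipliers k_j
    for _ in range(iterations):
        B = l[:, None] * A * k[None, :]
        for j in range(A.shape[1]):    # column pass
            b = B[:, j][B[:, j] != 0]
            if b.size:
                k[j] *= (np.sum(b ** -2.0) / np.sum(b ** 2.0)) ** 0.25
        B = l[:, None] * A * k[None, :]
        for i in range(A.shape[0]):    # row pass, using the new k_j
            b = B[i, :][B[i, :] != 0]
            if b.size:
                l[i] *= (np.sum(b ** -2.0) / np.sum(b ** 2.0)) ** 0.25
    return l, k
```

Applied to a matrix whose nonzeros span many orders of magnitude, a single pass already shrinks the ratio of the largest to the smallest element sharply, consistent with the remark below that one scaling iteration suffices.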

The practical results of applying this algorithm to problems A, B, D (see Table 2.1) are summarized in Figures 1 and 2. Figure 1 reports results for problem C: it shows the distribution of the matrix coefficients by value, each coefficient rounded to the nearest power of 2.

Curve 1 refers to the matrix before scaling, curve 2 to the matrix after five scaling iterations, by which point the scaling algorithm had converged. Curve 2 is much more regular and better centered around 2⁰. The ratio between the largest and smallest elements, which was 2²⁸ before scaling, fell to 2¹⁶.

Figure 2 gives the scaling results for several problems over the course of the scaling iterations. We characterize the distribution of the matrix elements by its variance, plotted on the ordinate; the abscissa gives the number of the scaling iteration at which the measurement was made, iteration 0 representing the initial state of the matrix.

Figure 2 suggests two remarks:

— The variance remains practically unchanged after the first iteration; it is therefore pointless to perform more than one scaling iteration.

— For the various problems, the variances become very close after one iteration.

Such a scaling method could be applied only thanks to the sparsity of the matrices and to the weakness of the couplings between the rows on one hand and the columns on the other, which matches the economic nature of the models.


Figure 1. Distribution of the coefficients of problem C before and after scaling.

Figure 2. Evolution of the variance of the coefficient distribution as a function of the number of scaling iterations, for problems A, B, D.


4. STARTING BASES AND PHASE 1

We assume known the steps of the simplex method as they are given in [1].

Among the problems considered, only A, C, E, F came with no starting basis at all. The basis then chosen consists of the singletons (columns having a single element, of the same sign as the corresponding element of the right-hand side) and of columns of the matrix chosen so as not to create a singularity and so as to remove at least one infeasibility (a nonzero artificial variable, or a negative variable); the basis is completed with artificial variables or negative variables. Phase 1, the search for a feasible solution, follows a method derived from the one that Wolfe [2] considers the most efficient. One seeks to drive the sum of the absolute values of the infeasibilities to zero by introducing into the basis, at each iteration, the two columns with the largest substitution costs ($d_j$, or $z_j - c_j$). The variable to leave the basis is chosen so as not to introduce a negative variable. Of the two chosen columns, the one that removes the largest number of infeasibilities is introduced first; the second is introduced into the basis only if it does not increase the sum of the infeasibilities. An iteration may thus introduce up to two variables; we then speak of a double pivot, and otherwise of a single pivot. The same method is also used in phase 2, when the objective function is optimized.
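The pricing part of this rule can be made concrete with a small sketch (our illustration, with assumed names and a dense representation, not the OPHELIE code): compute the substitution costs of all columns in one backward pass and retain the two most attractive nonbasic candidates.

```python
import numpy as np

def two_best_candidates(c, A, B_inv, basic):
    """Compute the substitution costs d_j = c_j - c_B B^{-1} A_j for
    every column and return the indices of the two most attractive
    nonbasic candidates (most negative d_j, for a minimization)."""
    c_B = c[basic]
    u = c_B @ B_inv                 # backward product u = c_B B^{-1}
    d = c - u @ A                   # reduced (substitution) costs d_j
    d = d.astype(float)
    d[basic] = 0.0                  # basic columns never enter
    order = np.argsort(d)           # most negative first
    return int(order[0]), int(order[1])
```

For a standard-form problem whose initial basis is the slack columns, the two candidates are simply the two columns with the most negative costs.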

Table 4.1 gives, for problems A, C, E, F, G, the number of infeasibilities at the start of the problem, the number of iterations needed to eliminate them, and the ratio of these two numbers. This ratio is fairly constant, except for problem F, whose density is very low.


Table 4.1

Comparison Between the Number of Phase-1 Iterations and the Number of Infeasibilities

Problem name                                A      C      E      F      G
Number of infeasibilities at the start    480    190    540   1768    377
Number of phase-1 iterations             4500   1540   4700   1928   2700
Ratio of phase-1 iterations to initial
  infeasibilities                         9.4    8.1    8.7    1.1    7.2
Number of constraints                    1221   1564   2070   2151   2617


Table 4.2 indicates in how many cases out of 100 iterations the second chosen variable (the one with the second of the two largest substitution costs) still improves the objective function or the sum of the infeasibilities after the first has been introduced.


Table 4.2

Comparison Between the Numbers of Single- and Double-Pivot Iterations

Problem name                             A     C     E     F     G
Percentage of double-pivot iterations   50    58    70    58    62










5. NUMBER OF ITERATIONS NEEDED FOR THE
COMPLETE RESOLUTION OF THE PROBLEMS

Although only a few problems were solved without a supplied starting basis, we give the corresponding figures in Table 5.1.


Table 5.1

Total Number of Iterations

Problem name              A      C      E      F
Number of iterations   8700   7413   5965   3331
Number of constraints  1221   1564   2070   2151
Density (‰)             6.2    4.6    3.0    1.5


It is quite surprising that the total number of iterations does not increase with the number of constraints. The reason for this paradox is the influence of the density, which in these problems varied at least inversely with the number of constraints.


6. STRUCTURE OF AN ITERATION

Let us recall the structure of an iteration of the simplex method in the product form of the inverse.

A candidate column $A_j$ is chosen by computing the product

$$d_j = V B^{-1} A_j = V \eta_p\, \eta_{p-1} \cdots \eta_1 A_j$$

where $d_j$ is the substitution cost of variable $A_j$; $V$ is a row vector with a single nonzero component, corresponding to the row of the objective function; and $\eta_p\, \eta_{p-1} \cdots \eta_1$ is the inverse of the current basis in product form.

The choice of the candidate columns is carried out in two parts:

$$U = V \eta_p\, \eta_{p-1} \cdots \eta_1 \quad \text{(the backward product)}$$

$$d_j = U A_j \quad \text{(computation of the } d_j \text{ for all nonbasic variables)}$$

Once a column $A_k$ has been chosen, one must determine which variable it will replace. The following products are computed:

$$Y_k = \eta_p\, \eta_{p-1} \cdots \eta_1 A_k \quad \text{(forward product for the first chosen column)}$$

$$Y_{k'} = \eta_p\, \eta_{p-1} \cdots \eta_1 A_{k'} \quad \text{(forward product for the second chosen column)}$$
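The backward and forward products can be illustrated with a toy eta-file representation. The encoding is our own assumption for the example: each factor η is stored as its pivot row r and replacement column v, so that η = I + (v − e_r)e_rᵀ, and the file grows by one factor at each pivot.

```python
import numpy as np

def forward(etas, x):
    """Forward product (eta_p ... eta_1) x: apply eta_1 first.
    Since eta = I + (v - e_r) e_r^T, we have eta x = x + x[r] (v - e_r)."""
    for r, v in etas:
        y = x + x[r] * v
        y[r] = x[r] * v[r]
        x = y
    return x

def backward(u, etas):
    """Backward product u eta_p ... eta_1: apply eta_p first.
    Only component r of the row vector changes: (u eta)[r] = u . v."""
    for r, v in reversed(etas):
        u = u.copy()
        u[r] = u @ v
    return u

def add_pivot(etas, entering, pivot_row):
    """Extend the eta file after pivoting the column `entering`
    (given in original coordinates) on `pivot_row`."""
    y = forward(etas, entering)          # transformed column B^{-1} A_k
    v = -y / y[pivot_row]
    v[pivot_row] = 1.0 / y[pivot_row]
    etas.append((pivot_row, v))
```

After two pivots building a 2×2 basis, applying `forward` to the unit vectors reproduces B⁻¹ column by column, and `backward` reproduces its rows.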




Table 6.1 gives, for the various problems, the durations in seconds of the following elements:

— the backward product,

— the computation of the $d_j$,

— the forward products for the two chosen columns,

— the sum of the three preceding durations,

— the mean duration of an iteration, as measured independently.


Table 6.1

Comparison Between the Elementary Times, in Seconds, of the Various
Computations Performed During an Iteration

Problem name                          A       C       E       F       G
T1 — backward product             0.510   0.348   0.360   0.289   0.647
T2 — computation of the d_j       0.726   1.938   0.791   1.133   1.696
T3 — forward products             0.566   0.236   0.247   0.205   0.553
T1 + T2 + T3                      1.802   2.522   1.398   1.627   2.896
Mean duration of an iteration      2.46    2.58    2.2     2.09    3.6
Ratio of T1 + T2 + T3 to the
  mean iteration duration          0.73    0.97    0.63    0.78    0.80
Number of constraints              1221    1564    2070    2151    2617


Table 6.1 shows the very small share of the forward product in the duration of an iteration, at most 23%, even though this computation corresponds to the possible introduction of two columns into the basis. It also shows the relatively large share of the computation of the $d_j$, up to 48%.

Setting these two figures against the percentage of double-pivot iterations (Table 4.2) shows the full interest of this operation, and perhaps even of a multiple pivot.

Table 6.1 suggests further remarks. For some problems, the cumulated duration of the backward and forward products and of the computation of the $d_j$ amounts to only 63% of the mean iteration duration (problem E). The reason is the following: problem E is the one with the longest phase 1, and phase 1 contains a computation of appreciable duration, the determination of the variable to leave the basis:

$$\theta = \min_i \frac{x_i}{y_i}$$

In phase 1 this computation must be performed several times: so as not to introduce negative variables, so as not to make an artificial variable negative, and to eliminate the maximum number of negative variables. In phase 2 it is done only once.
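The minimum-ratio computation can be sketched as follows, restricted here to the standard case of rows with positive components; the several phase-1 variants mentioned above differ only in which rows are eligible. The function name and dense representation are our assumptions for the example.

```python
import numpy as np

def ratio_test(x_B, y, eps=1e-12):
    """theta = min over {i : y_i > 0} of x_B[i] / y[i]; returns the
    index of the leaving row and the step length theta."""
    rows = np.where(y > eps)[0]
    ratios = x_B[rows] / y[rows]
    i = rows[np.argmin(ratios)]
    return int(i), float(x_B[i] / y[i])
```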









7. THE REINVERSIONS

To limit the total volume of the product-form matrices of the inverse, which grows at each iteration, reinversions are performed; they reduce the duration of the subsequent iterations as well as the effect of rounding errors.

The reinversion method adopted is the following: the pivot row of an elementary matrix is chosen as the one with the fewest nonzero elements in the part of the matrix not yet reinverted; the pivot column is chosen, among those having an element in the pivot row, as the one whose total number of elements is smallest.

Reinversion is triggered at the moment that optimizes the overall solution time.
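The pivot rule just described can be sketched as follows. This is an illustrative dense implementation of ours; a production code would of course work on the sparse representation.

```python
import numpy as np

def choose_pivot(A, remaining_rows, remaining_cols):
    """Reinversion pivot choice: among the not-yet-processed part, take
    the row with the fewest nonzeros; then, among the columns with a
    nonzero in that row, take the column with the fewest nonzeros
    overall."""
    nz = A != 0
    cols = sorted(remaining_cols)
    counts = {i: int(nz[i, cols].sum()) for i in remaining_rows}
    row = min((i for i in remaining_rows if counts[i] > 0),
              key=counts.get)
    candidates = [j for j in cols if nz[row, j]]
    col = min(candidates, key=lambda j: int(nz[:, j].sum()))
    return row, col
```

On a lower-triangular pattern this rule picks the singleton row first, which is what keeps the fill-in (and hence the reinversion time) small.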

Table 7.1 gives the numbers of iterations and of reinversions performed on each problem. One of the main characteristics is the high frequency of these reinversions, made affordable by their speed: on average there is one reinversion every 30 iterations.


Table 7.1

Comparison Between the Number of Reinversions and the Number of Iterations

Problem name                         A      C      E      F      G
Total number of iterations        8712   7413   5965   3331   6689
Total number of reinversions       407    194    282     70    275
Ratio of iterations to
  reinversions                      21     38     21     48     24

Table 7.2 allows comparisons between the mean reinversion time for each problem, the number of nonzero elements of the bases to be inverted, and the number of nonzero elements of all the


Table 7.2

Comparison Between the Durations of the Inversions and Between the Numbers
of Elements Before and After Reinversion

Problem name                           A       C       E       F       G
Duration of the reinversions (s)      20    10.5     9.6     10      24
Number of elements after
  reinversion                      11862    7885    6946    5138   11886
Number of elements of the basis     6676    7083    6281    4760    9883
Ratio of elements after reinver-
  sion to elements of the basis     1.78    1.11    1.1     1.08    1.2






product-form matrices of the inverse. This last number may vary with the inversion method used.

The ratio between the number of elements of the inverse matrices after inversion and the number of elements of the basis is thus a characteristic of the inversion method. This ratio is strongly influenced by the density of the original problem: it is 1.78 for a problem of density 6.2‰, against 1.15 on average for problems whose mean density is close to 3.3‰. One also notices that the duration of an inversion is practically independent of the number of constraints, but that it does depend on the number of elements generated.

Table 7.3 compares the total time needed to solve each problem with the percentage of that time devoted to reinversions. This percentage, although sometimes high, as for problems A and G, remains relatively modest if one takes into account the large number of reinversions performed, on average 1 every 30 iterations (see Table 7.1); in our view its smallness is due to the very short duration of the reinversions.


Table 7.3

Comparison Between the Solution Time and the Cumulated Time of the Reinversions

Problem name                             A      C      E      F      G
Total solution time (min)              493    354    269    128    512
Cumulated reinversion time (min)       135     34     45     12    110
Ratio of reinversion time to total
  solution time                       0.27   0.10   0.17   0.10   0.21


8. ESTIMATING THE SOLUTION TIME

Like Wolfe in [2], we estimate the solution time in numbers of elementary operations, an elementary operation consisting of one floating-point addition and one floating-point multiplication. The duration of an elementary operation on the CONTROL DATA 3600 is 10 microseconds. In Table 8.1 we have converted the solution times into numbers of operations and formed the ratio of the number of operations to the cube of the number of constraints, N/M³. This ratio varies considerably between a dense problem (A) and a sparse one (F). A formula involving the square of the density and the cube of the number of constraints, k d² M³, seems to represent the number of elementary operations better, the factor k averaging 3100 when the density is expressed in percent.

The values of N/M³ obtained here can be set against those that Wolfe gives in [2] for problems whose dimension was much smaller, below one hundred. Wolfe indicated a number of operations of the order of 0.3 to 3 M³; here we find 0.08 to 1.6 M³. This agreement between two series of very different problems, whose solution times differ by a factor of several hundred, is quite remarkable.

Table 8.1

Total Number of Operations / M³

Problem name                        A           C           E           F
N = total number of operations  2.95×10⁹    2.22×10⁹    1.6×10⁹     0.77×10⁹
N/d²M³                             4200        2600        2000        3500
N/M³                               1.62        0.58        0.18        0.078


9. ANNEX

One may consider storing the matrix coefficients two per machine word. Each word having 48 bits, a coefficient gets 24 bits, one of which is its sign. All possible coefficients must therefore be expressed with 23 significant bits without a perceptible loss of precision relative to the input elements. A fixed-point representation was considered, in which the largest coefficient of all is expressed with 23 bits; the smaller ones then have fewer significant digits. Here lies the whole interest of a scaling method: with all the elements very close to one another, they are expressed with a large number of significant bits, and the precision corresponds to 10⁻⁷ in relative value.
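The storage scheme of the Annex can be illustrated with a toy fixed-point packer. The field layout and scale below are our assumptions for the example, and the 48-bit CDC word is emulated with a Python integer.

```python
def pack_pair(a, b, frac_bits=16):
    """Pack two signed fixed-point coefficients into one 48-bit word,
    24 bits each (sign plus 23 significant bits), the binary point
    being fixed by frac_bits."""
    def to24(x):
        q = round(x * (1 << frac_bits))
        assert -(1 << 23) <= q < (1 << 23), "coefficient out of range"
        return q & 0xFFFFFF            # 24-bit two's complement
    return (to24(a) << 24) | to24(b)

def unpack_pair(word, frac_bits=16):
    """Recover the two fixed-point coefficients from a packed word."""
    def from24(q):
        if q >= 1 << 23:
            q -= 1 << 24               # undo two's complement
        return q / (1 << frac_bits)
    return from24(word >> 24), from24(word & 0xFFFFFF)
```

Scaling all coefficients close to unity keeps them well inside the 23-bit range, which is precisely what makes the packing of Section 3's scaled matrices safe.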


10. REFERENCES

[1] Saul I. Gass, Linear Programming: Methods and Applications, McGraw-Hill, New York, 1958.

[2] Philip Wolfe and Leola Cutler, "Experiments in Linear Programming," The RAND Corporation, RM-3402-PR, December 1962.


EXPERIMENTS AND STATISTICAL DATA ON THE 
SOLVING OF LARGE-SCALE LINEAR PROGRAMS 

RÉSUMÉ

Given the expansion of linear programming which is expressed by the 
resolution of programs whose dimensions are increasing steadily and by trials 
that have been made on various methods of decomposition, it is interesting to 
see in what respect the classical simplex utilizing the product form of the 
inverse permits the practical resolution of such models. 




UNCONSTRAINED MINIMIZATION TECHNIQUE FOR CONVEX PROGRAMMING 13 


Among a certain number of linear programs resolved by the code
OPHELIE on the CONTROL DATA 3600, a sample of seven problems, from
1221 rows up to 2617 rows, was chosen. From these problems a
certain number of tables have been compiled:

— number of iterations necessary for the resolution, 

— the number of operations, 

— ratios between calculations made during an iteration, 

— numerical information on the inversions, 

— the ratio between the accumulated time of optimization and of 
inversions, etc. 

Also presented is the application of a scaling algorithm of the linear program 
matrix to such problems in order to reduce the dispersion of the magnitude 
of the coefficients. The efficiency of the simplex method for solving problems 
with large-scale dimensions has been pointed out. 


THE SLACKED UNCONSTRAINED MINIMIZATION
TECHNIQUE FOR CONVEX PROGRAMMING
La Technique de Minimisation avec Variables d'Écart
et sans Contrainte pour un Programme Convexe

A. V. Fiacco

Northwestern University, Evanston, Illinois

G. P. McCormick

Research Analysis Corporation, McLean, Virginia
United States of America

1. INTRODUCTION 
The mathematical programming problem 

(A) minimize / ( x ), subject to £i(x) 0, /= 1, . . . , m, 

can be converted into the equivalent problem 

(B) minimize /(.r), subject to gi(x) — ft =. 0, 1, 

f “ 1, ... t 777. 

An attempt to solve problem (B) by applying the Sequential Unconstrained 
Minimization Technique [1], [2] and using the extension to handle equality 
constraints [3] would result in minimizing 

P[x, l , rt] == f(x) -rr t yL f r*-** 2 fo(*) - *i] s 


( 1 ) 




14 A. V. FIACCO AND G. P. MCCORMICK 

over positive values of t for a strictly decreasing null sequence of values {r_k}.

Solution of the mathematical programming problem by this method represents a quite different approach from the algorithm described in [1] and [2]. Briefly, the method embodied in (1) [as modified later in (2)] is an "outside-in" method instead of an "inside" method. Its implications, along with other differences and similarities, are discussed in Section 4.

Although (1) was derived from the equality version of SUMT, a modification is used instead to take advantage of the nonnegativity of the t_i.

The purpose of this article is to prove convergence of the following algorithm when (A) is a convex programming problem. Form the function

$$P[x, t, r_k] = f(x) + r_k^{-1} \sum_i \left[ g_i(x) - t_i \right]^2 \qquad (2)$$

The algorithm proposed is to minimize (2) in nonnegative t for some r_1 > 0. Starting from that minimum, r is reduced and P[x, t, r_2] is minimized again. Proceeding in this fashion, it is necessary to show that the x's that minimize (2) exist for every r_k > 0 and that they converge to a solution of problem (A) as r_k → +0.
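A minimal numerical sketch of this scheme may help; it is our illustration, not the authors' implementation. For fixed x the minimizing slack is t_i = max(g_i(x), 0), so minimizing (2) over x and t ≥ 0 reduces to an exterior quadratic penalty on the violations min(g_i(x), 0), and the unconstrained minima approach the feasible region from outside as r_k → 0, the "outside-in" behavior noted above. The crude descent routine and all names are assumptions for the example.

```python
import numpy as np

def minimize_bt(P, x, iters=300, h=1e-6):
    """Crude unconstrained minimizer: gradient descent with a
    backtracking (Armijo) line search and central-difference gradients."""
    for _ in range(iters):
        g = np.array([(P(x + h * e) - P(x - h * e)) / (2 * h)
                      for e in np.eye(x.size)])
        s = 1.0
        while P(x - s * g) > P(x) - 0.5 * s * (g @ g) and s > 1e-14:
            s *= 0.5
        x = x - s * g
    return x

def slumt(f, g, x0, r0=1.0, shrink=0.1, outer=6):
    """Sketch of the SLUMT scheme of equation (2): minimize
    P[x, t, r] = f(x) + (1/r) * sum_i (g_i(x) - t_i)^2 over x and
    t >= 0 for a decreasing sequence of r; with the optimal slack
    substituted, only the constraint violations are penalized."""
    x, r = np.asarray(x0, float), r0
    for _ in range(outer):
        P = lambda x, r=r: f(x) + np.sum(np.minimum(g(x), 0.0) ** 2) / r
        x = minimize_bt(P, x)
        r *= shrink
    return x
```

On the toy problem "minimize (x − 2)² subject to x ≤ 1" the iterates move from the infeasible start toward the solution x = 1.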

Suggestions for this type of algorithm have already been made ([4], [5]). In [5],† quite independently, T. Pietrzykowski gave a proof of the convergence of an equivalent version of the SLUMT algorithm under slightly stronger conditions than those imposed here (he assumed compactness of the feasible region).

Under the weaker conditions, this article extends his results by proving the following:

1. A minimum of P[x, t, r_k] exists for every value of r_k (not just for r_k very small, as in [5]).

2. A dual feasible point exists at every P-minimum, yielding a lower bound on the optimum objective function value and, in the limit, the dual variables or shadow prices.


2. PRIMAL CONVERGENCE

In order to prove that SLUMT solves the programming problem (A) (hereafter called the primal convex programming problem), use is made of the following conditions and lemmas.

C1. The functions f, −g_1, ..., −g_m are convex (hence continuous) functions of x.

C2. Define R = {x | g_i(x) ≥ 0, i = 1, ..., m}. Then assume that for some finite k, R_k = {x | f(x) ≤ k, x ∈ R} is nonempty and bounded (hence compact).

NOTE. 1. This is weaker than assuming that the feasible region R is bounded.

2. C2 is equivalent to assuming that the set of x's which solves (A) is nonempty and compact (see Lemma 1).

† The authors are indebted to Professor Willard Zangwill for this reference.





3. We are not assuming that R⁰ = {x | g_i(x) > 0, i = 1, ..., m} is nonempty, nor are we assuming that the Kuhn-Tucker constraint qualification [7, p. 483] is satisfied.

4. C1 and C2 imply the existence of a finite vector x̄ and a scalar v₀ such that f(x̄) = v₀ = inf_R f(x).

Lemma 1. Let g_i, i = 0, 1, ..., m, be concave functions such that G = {x | g_i(x) ≥ 0, i = 0, ..., m} is a bounded nonempty set. Then for any set of finite perturbations {ε_i}, where ε_i ≥ 0, i = 0, 1, ..., m, the set {x | g_i(x) ≥ −ε_i, i = 0, 1, ..., m} is bounded.

Proof. Obviously it suffices to show that when ε₀ > 0 and ε_i = 0 for i = 1, ..., m, the set G_ε = {x | g₀(x) ≥ −ε₀, g_i(x) ≥ 0, i = 1, ..., m} is bounded.

Assume the contrary. Then from some point x_G ∈ G [G is nonempty by hypothesis] a straight line emerges that pierces the boundary of G but not that of G_ε. Let x_δ be a point on that line such that g₀(x_δ) = −δ < 0 and g_i(x_δ) ≥ 0, i = 1, ..., m. From the concavity of g₀,

$$-\delta = g_0[x_\delta] \ge \lambda\, g_0\{x_G + (1/\lambda)(x_\delta - x_G)\} + (1 - \lambda)\, g_0[x_G], \quad \text{where } 0 < \lambda < 1.$$

Rewriting yields

$$g_0\{x_G + (1/\lambda)(x_\delta - x_G)\} \le \frac{-\delta - (1 - \lambda)\, g_0[x_G]}{\lambda} \le -\frac{\delta}{\lambda},$$

since g₀[x_G] ≥ 0 and 1 − λ > 0.

Taking the limit as λ → 0 shows that a point on the line from x_G passing through x_δ takes on values of g₀ smaller than −ε₀. Q.E.D.

Lemma 2. Let {ε_1, ..., ε_m} be an arbitrary set of positive numbers. Then for every finite k, {x | f(x) ≤ k, g_i(x) ≥ −ε_i, i = 1, ..., m} is bounded (hence compact).

Proof. Make the correspondence g_0(x) = −f(x) + k and the proof follows from Lemma 1 and condition C2.


Lemma 3. For every r > 0 and any finite K, T_r = {[x, t] | t ≥ 0, f(x) + r^{-1} Σ_i [g_i(x) − t_i]^2 ≤ K} is a bounded set (possibly empty).

Proof. If T_r = ∅, the conclusion holds trivially. Otherwise it suffices to prove that the set of x's in T_r is bounded. Assume the contrary. Let {[x_k, t_k]} be an infinite sequence of points in T_r for which P[x_k, t_k, r] ≤ K and such that lim_{k→∞} |x_k| = ∞.

By Lemma 2 some g_i(x_k) → −∞ as k → ∞. Thus for P to be bounded above, {f(x_k)} must go to −∞.

Let ε > 0 be any positive number and

R(ε) = {x | f(x) ≤ c = max(K, v_0) + 1; g_i(x) ≥ −ε for i = 1, ..., m}.

By Lemma 2, R(ε) is bounded. It is not empty, since x̄ is in its interior R°(ε). [Recall f(x̄) = v_0 = inf_R f(x).] Consider points far enough out in the sequence so that x_k ∉ R(ε), f(x_k) < v_0.



16 


A. V. FIACCO AND G. P. MCCORMICK 


Connect x̄ and x_k. Let x_1 denote the point on the line connecting x̄ and x_k which intersects the boundary of R(ε). Let λ_k, 0 < λ_k < 1, be the scalar such that λ_k x_k + (1 − λ_k) x̄ = x_1. [λ_k ≠ 0, since x̄ ∈ R°(ε) and x_1 is on the boundary of R(ε).]

By convexity (and the selection of x_k)

f[x_1] ≤ λ_k f[x_k] + (1 − λ_k) f(x̄) < v_0.  (3)

Thus f[x_1] < c, implying that x_1 must be on the boundary portion of R(ε) where g_i[x_1] = −ε for some i.

For convenience consider that i is 1, that is, g_1(x_1) = −ε.

Define v_1 = min_{R(ε)} f(x), which exists since R(ε) is compact. Using this and the convexity inequality obtained in (3), we have that

f(x_k) ≥ [f(x_1) − (1 − λ_k) f(x̄)]/λ_k ≥ [v_1 − (1 − λ_k) v_0]/λ_k = (v_1 − v_0)/λ_k + v_0.  (4)

From the concavity of the g_i(x) and the fact that g_1(x̄) ≥ 0, 0 > −ε = g_1(x_1) ≥ λ_k g_1(x_k) + (1 − λ_k) g_1(x̄) ≥ λ_k g_1(x_k), giving

g_1^2(x_k)/r ≥ ε^2/(λ_k^2 r), for any r > 0.  (5)

Summing (4) and (5) gives

(v_1 − v_0)/λ_k + v_0 + ε^2/(λ_k^2 r) ≤ f(x_k) + g_1^2(x_k)/r

≤ f(x_k) + r^{-1}[g_1(x_k) − t_1]^2 for all t_1 ≥ 0, since g_1(x_k) < 0,

≤ P[x_k, t, r], t ≥ 0, (r > 0).

This inequality will be used to force a contradiction.

Let λ* be a number such that for 0 < λ < λ*,

(v_1 − v_0)/λ + v_0 + ε^2/(λ^2 r) > K.

Consider k > k*, where for k > k*, λ_k < λ*, and x_1 and x_k are selected to satisfy the construction given above. Since x_1 is taken from a compact set and lim_{k→∞} |x_k| = lim_{k→∞} |x̄ + (1/λ_k)(x_1 − x̄)| = ∞, clearly this can be done; but from the above inequality this immediately implies that P[x_k, t, r] > K for all t ≥ 0, all k > k*. This contradicts the assumption that P[x_k, t_k, r] ≤ K for the infinite sequence {[x_k, t_k]}, where t_k ≥ 0 and lim_{k→∞} |x_k| = ∞. Q.E.D.




Lemma 4.

(a) The SLUMT P-function (2) is, for each r_k > 0, minimized by a finite point [x(r_k), t(r_k)] (not necessarily unique).

(b) The set of [x, t] at which the P-function can be minimized for any r_k (where 0 < r_k ≤ r_1) is contained in a bounded set that depends only on r_1.

Proof. A uniform upper bound is available on the possible minimum values of each P[x, t, r_k]. Let x̄ be a point solving (A) and v_0 = f(x̄). Then

inf_{all x, all t ≥ 0} P[x, t, r_k] ≤ inf_{t ≥ 0} {f(x̄) + r_k^{-1} Σ_i [g_i(x̄) − t_i]^2}  (6)

= f(x̄) = v_0 (where t_i is set equal to g_i(x̄) ≥ 0).

Let

T_k = {[x, t] | f(x) + r_k^{-1} Σ_i [g_i(x) − t_i]^2 ≤ v_0, t ≥ 0}, for k = 1, 2, ....  (7)

By Lemma 3 each T_k is bounded. This proves (a). Since r_1 ≥ r_k, T_k ⊂ T_1. This proves (b).

Lemma 5. If (x_k^1, t_k^1) and (x_k^2, t_k^2) both locally minimize P[x, t, r_k], then P[x_k^1, t_k^1, r_k] = P[x_k^2, t_k^2, r_k]. (In effect every local minimum is a global minimum.)

Proof. This follows from the fact that for fixed x the ith component of the minimizing t is given as t_i = max[0, g_i(x)]. Thus each term in the P-function involving g_i(x) takes the form {½[g_i(x) − |g_i(x)|]}^2, which is a convex function when g_i(x) is concave. All local minima of a convex function are global minima. Q.E.D.

Theorem 1 (Primal Convergence). Let {r_k} be an infinite sequence of positive numbers such that r_k > r_{k+1} > 0 and lim_{k→∞} r_k = 0. Under conditions C1 and C2, for any r_k > 0, the function P[x, t, r_k] is minimized in the region in which t ≥ 0 by a finite point [x(r_k), t(r_k)] (not necessarily unique). Further,

lim_{k→∞} r_k^{-1} Σ_i [g_i(x(r_k)) − t_i(r_k)]^2 = 0,  (8)

lim_{k→∞} P[x(r_k), t(r_k), r_k] = v_0,  (9)

lim_{k→∞} f[x(r_k)] = v_0;  (10)

that is, every limit point of the uniformly bounded sequence {x(r_k)} solves the convex programming problem (A).

Proof. The finiteness of each [x(r_k), t(r_k)] and the uniform bound on the region containing {[x(r_k), t(r_k)]} is given by Lemma 4.

Let v_1 = inf_{T_1} f(x), where T_1 is as defined in Lemma 4 (7).  (11)



Combination with (6) yields

v_0 ≥ P[x(r_k), t(r_k), r_k] ≥ v_1,  (12)

and each term of P is uniformly bounded below. Let x* be a limit point of the uniformly bounded sequence {x(r_k)}. Then x* is primal feasible (i.e., g_i(x*) ≥ 0, all i); otherwise P → +∞. If (8) did not hold, then f(x*) < v_0, contradicting the fact that v_0 = inf_{x∈R} f(x). Hence (8) holds, and by the same reasoning (9) and (10) hold. Q.E.D.

Note that when x(r_k) is ascertained, t(r_k) must be given by

t_i(r_k) = max{0, g_i[x(r_k)]}, i = 1, ..., m.  (13)
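As a concrete illustration of the convergence just proved, the following sketch (our own, with a toy problem not taken from the paper: minimize f(x) = x subject to g(x) = x − 1 ≥ 0, so v_0 = 1) minimizes the P-function with the slack eliminated through (13), using crude gradient descent, for a strictly decreasing sequence {r_k}:

```python
# Illustrative sketch (not the authors' code) of SLUMT: minimize
# P[x, t, r] = f(x) + (1/r) * sum_i (g_i(x) - t_i)^2 with t_i = max(0, g_i(x))
# eliminated, so each penalty term becomes (min(0, g_i(x)))^2.
# Toy problem (our choice): f(x) = x, g(x) = x - 1 >= 0, solution x = 1.

def minimize_P(r, x0, step=1e-3, iters=200000):
    # Crude gradient descent; dP/dx = 1 + (2/r) * min(0, x - 1).
    x = x0
    for _ in range(iters):
        grad = 1.0 + (2.0 / r) * min(0.0, x - 1.0)
        x -= step * grad
    return x

x = 0.0                      # start infeasible (SLUMT is an exterior method)
for r in [1.0, 0.1, 0.01]:   # strictly decreasing r_k, tending to 0
    x = minimize_P(r, x)
# Analytically x(r) = 1 - r/2: the iterates approach the solution x = 1
# from the infeasible side, as Theorem 1 asserts.
print(x)
```

With r = 0.01 the final iterate is x(r) = 1 − r/2 = 0.995, still slightly infeasible, which is exactly the relaxation behavior described above.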

3. DUAL CONVERGENCE 

Consider the following programming problem (called the dual problem). In a slightly different form it was originally formulated by Wolfe [6] (who used a differentiable form):

maximize_{x, u} {G[x, u] = f(x) − Σ_i u_i · g_i(x)}

subject to

G[x, u] = inf_ξ G[ξ, u],  (14)

u_i ≥ 0, i = 1, ..., m.  (15)

[Denote by D the set of (x, u) satisfying (14) and (15).]

It will now be shown that SLUMT solves the dual as well as the primal convex programming problem. In particular, for every r_k > 0 any [x(r_k), t(r_k)] that minimizes the slacked P-function yields a dual feasible point. The importance of this is given by the following lemma.

Lemma 6. Any point that is primal feasible yields an objective function value at least as great as that of any dual feasible point; that is, let y ∈ R and [x°, u°] satisfy (14) and (15). Then

f(y) ≥ f(x°) − Σ_i u_i° · g_i(x°).

Proof.

f(y) ≥ f(y) − Σ_i u_i° · g_i(y)  [since y ∈ R and by (15)]

≥ inf_ξ [f(ξ) − Σ_i u_i° · g_i(ξ)]  [by (14)]

= f(x°) − Σ_i u_i° · g_i(x°). Q.E.D.

Lemma 7. Define u_i(r_k) = −2 r_k^{-1} {g_i[x(r_k)] − t_i(r_k)}, i = 1, ..., m. Then [x(r_k), u(r_k)] is dual feasible [i.e., satisfies (14) and (15)].

Proof. For notational convenience x_k denotes x(r_k) and u_i^k denotes u_i(r_k). Let y be any point. Let z be a point on the line segment connecting x_k and y, and let θ be an associated positive number (0 < θ < 1) such that z = x_k + θ[y − x_k].





By convexity

f(y) ≥ f(x_k) + θ^{-1}[f(z) − f(x_k)].  (16)

By concavity

g_i(y) ≤ g_i(x_k) + θ^{-1}[g_i(z) − g_i(x_k)].  (17)

Because [x_k, t_k] minimizes P[x, t, r_k],

f[x_k] + r_k^{-1} Σ_i [g_i(x_k) − t_i^k]^2 ≤ f(z) + r_k^{-1} Σ_i [g_i(z) − t_i(z)]^2,  (18)

where t_i(z) ≥ 0, i = 1, ..., m. By using (16) and (18),

f(y) − f(x_k) ≥ θ^{-1}[f(z) − f(x_k)]

≥ (θ r_k)^{-1} {Σ_i [g_i(x_k) − t_i^k]^2 − Σ_i [g_i(z) − t_i(z)]^2}.  (19)

It is useful at this time to divide the constraints into two classes, those for which g_i(z) < 0 and those for which g_i(z) ≥ 0, for all θ such that 0 < θ < θ_1 < 1. Such a θ_1 exists because of the concavity of the constraints. Rearrange the constraints so that the first B are in the first category.

Because of (13), t_i^k = 0 and u_i^k = −2 r_k^{-1} g_i(x_k), for i = 1, ..., B, and t_i^k = g_i(x_k), u_i^k = 0, for i = B + 1, ..., m. Thus (15) is satisfied. Now let t_i(z) = 0, i = 1, ..., B, and

t_i(z) = g_i(z), for i = B + 1, ..., m.

Consider θ to be smaller than θ_1 (and > 0) so that

g_i(z) < 0, i = 1, ..., B,

and

g_i(z) ≥ 0, i = B + 1, ..., m.

Then (19) becomes

f(y) − f(x_k) ≥ (θ r_k)^{-1} Σ_{i=1}^{B} [g_i(x_k) − g_i(z)][g_i(x_k) + g_i(z)].  (20)

Since g_i(x_k) + g_i(z) < 0 (for the range of θ considered), (17) may be substituted in (20) to yield

f(y) − f(x_k) ≥ r_k^{-1} Σ_{i=1}^{B} [g_i(x_k) − g_i(y)][g_i(x_k) + g_i(z)].  (21)

Now, taking the limit as θ → 0 (so that z → x_k) and rearranging,

f(y) − Σ_{i=1}^{m} u_i^k · g_i(y) ≥ f(x_k) − Σ_{i=1}^{m} u_i^k · g_i(x_k).  (22)

Because u_i^k = 0 for i = B + 1, ..., m, (22) shows that x_k is the minimum of the Lagrangian for those choices of u; that is, (14) is satisfied. Q.E.D.





Now a modification of the basic duality theorem for convex programming can be proved by using the convergence properties of SLUMT.

Theorem 2 (Dual Convergence). When f(x), −g_i(x), i = 1, ..., m, are convex functions (C1) and the set of points that solves the convex programming problem (A) is compact (C2), then there exist points that are dual feasible such that

v_0 = inf_R f(x) = sup_D [f(x) − Σ_i u_i · g_i(x)],

where D = {[x, u] satisfying (14) and (15)}.

Proof. Each [x(r_k), u(r_k)] (as defined in Lemma 7) is a dual feasible point, where

G[x(r_k), u(r_k)] = f[x(r_k)] + 2 r_k^{-1} Σ_i {g_i[x(r_k)] − t_i(r_k)} · g_i[x(r_k)] ≤ v_0 (by Lemma 6).

Use of (8), (10), and (13) yields the result that lim_{k→∞} G[x(r_k), u(r_k)] = v_0. Q.E.D.

The theorem (Wolfe [6]) is usually stated as asserting that there exists an [x, u] dual feasible for which G[x, u] = v_0. His proof requires the Kuhn-Tucker constraint qualification ([7], p. 483) to be valid. The difficulty this avoids is the existence of "infinite" dual variables.† There is no assurance in our algorithm that lim_{k→∞} u(r_k) is finite without additional requirements on the constraint region; for example, there is no finite u for [x, u] to solve the dual of the primal programming problem minimize x subject to −x^2 ≥ 0, although both C1 and C2 are satisfied by this problem and the solution using SLUMT is obtained without difficulty.
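To make the dual estimates of Lemma 7 tangible, here is a small numerical check on a toy problem of our own choosing (not from the paper): for minimize x subject to g(x) = x − 1 ≥ 0, setting dP/dx = 1 + (2/r)(x − 1) = 0 gives the exact P-minimizer x(r) = 1 − r/2 with t(r) = 0, and the formula u(r) = −(2/r)[g(x(r)) − t(r)] recovers the shadow price for every r:

```python
# Hedged numerical check (toy problem, not from the paper): minimize x
# subject to g(x) = x - 1 >= 0.  The true Lagrange multiplier is 1.

def dual_estimate(r):
    # Exact P-minimizer for this toy problem, on the infeasible side:
    x_r = 1.0 - r / 2.0
    g = x_r - 1.0                  # = -r/2 < 0, so t(r) = max(0, g) = 0
    t = max(0.0, g)
    u = -2.0 / r * (g - t)         # Lemma 7's dual-variable estimate
    return x_r, u

for r in [1.0, 0.1, 0.001]:
    x_r, u = dual_estimate(r)
    # Dual objective G = f(x_r) - u * g(x_r) = (1 - r/2) + r/2 = 1 = v0.
    print(r, u, x_r - u * (x_r - 1.0))
```

Here u(r) = −(2/r)(−r/2) = 1 exactly for every r, and the dual objective G equals v_0 = 1 throughout, illustrating Theorem 2; in general the dual estimates only converge in the limit r_k → 0.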


4. DISCUSSION 

The SLUMT algorithm is not a trivial variant of the Sequential Unconstrained Minimization Technique (SUMT). The differences are listed as follows:

1. SUMT is primal feasible for the inequality constraints at every iteration, whereas SLUMT approaches the optimum from the infeasible region (i.e., it is a relaxation technique).

2. At any time during minimization, since SLUMT is concerned only 
with those constraints that are infeasible, the amount of work required per 
move in attempting to minimize the P-function is considerably less; for 
example, if the optimum is unconstrained, SLUMT will find it at the first 
r minimum. 

† It can be shown, in a manner analogous to the proof in Fiacco and McCormick [3], that if the set R° = {x | g_i(x) > 0, i = 1, ..., m} is nonempty, the limiting values of u_i(r_k) are finite, i = 1, ..., m.





3. SLUMT does not require an interior to the inequality-constrained region and, in particular, the Kuhn-Tucker constraint qualification need not be satisfied.

4. There is no separate "feasibility phase" required before optimization takes place, as in SUMT.

The similarities are many. Both are dual methods. The same extrapolation properties apply (see [3]) in SLUMT and are developed elsewhere. Both
can be used by placing only the nonlinear constraints in the P-function and 
maintaining the linear inequalities as side conditions. This enables both to 
be combined with algorithms for minimizing a nonlinear function subject to 
linear inequality constraints. 

Current work in implementing this for use on a high-speed computer is 
reported in a later article. 


5. REFERENCES 

[1] A. V. Fiacco and G. P. McCormick, "The Sequential Unconstrained Minimization Technique for Nonlinear Programming, A Primal-Dual Method," Management Science, 10(2), 360-366 (1964).

[2] A. V. Fiacco and G. P. McCormick, “Programming under Nonlinear Constraints 
by Unconstrained Minimization: A Primal-Dual Method,” Research Analysis 
Corp., McLean, Virginia, RAC-TP-96, September 1963. 

[3] A. V. Fiacco and G. P. McCormick, "The Sequential Unconstrained Minimization Technique for Convex Programming with Equality Constraints," Research Analysis Corp., McLean, Virginia, RAC-TP-155, April 1965.

[4] Glen D. Camp, "Inequality-Constrained Stationary-Value Problems," J. Opns. Res. Soc. Am. 3, No. 4, 548-550 (November 1955).

[5] T. Pietrzykowski, "Application of the Steepest Descent Method to Concave Programming," Proc. IFIPS Congr., Munich, 1962, North Holland, Amsterdam, 185-189 (1962).

[6] Philip Wolfe, "A Duality Theorem for Nonlinear Programming," Quart. Appl. Math., 19(3), 239-244 (1961).

[7] H. W. Kuhn and A. W. Tucker, "Nonlinear Programming," Proc. 2nd Symp. Math. Stat. Prob., J. Neyman (Ed.), University of California Press, Berkeley, 1951, pp. 481-493.


LA TECHNIQUE DE MINIMISATION AVEC VARIABLES D'ÉCART ET SANS CONTRAINTE POUR UNE PROGRAMMATION CONVEXE

RÉSUMÉ

An algorithm is presented for solving the problem: minimize f(x) (a convex function) subject to the constraints g_i(x) ≥ 0, i = 1, ..., m (each g_i being a concave function). To be precise, the function

P[x, t, r_k] = f(x) + r_k^{-1} Σ_i [g_i(x) − t_i]^2

is minimized over x, t (t ≥ 0) for a sequence {r_k} strictly decreasing to 0. This extends the work of T. Pietrzykowski (Proc. IFIPS Congr., pp. 185-189, Munich, 1962, North Holland, Amsterdam, 1962). It is shown that for every r_k there exists a finite point x(r_k), t(r_k) minimizing P, which yields a solution of the convex problem as r_k → 0. The algorithm resembles the SUMT method (Sequential Unconstrained Minimization Technique, Management Science, 10(2), pp. 360-366, 1964) in that it solves the dual problem (in Wolfe's sense). It differs from the SUMT method in that:

— it approaches the optimum through the infeasible region (in other words, it is a relaxation technique),

— it does not require the region defined by the nonlinear constraints to have a nonempty interior,

— it requires no separate phase for obtaining a feasible solution,

— the computations required to solve the problem are considerably reduced.


APPLYING BENDERS’ PARTITIONING METHOD TO 
A NONCONVEX PROGRAMMING PROBLEM 

Application de la Méthode de Partage de Benders à un Problème de Programmation Non-Convexe

C. K. C. Metz,† R. N. Howard,‡ and J. M. Williamson§

United Kingdom 


1. INTRODUCTION 

Ironworks management has a wide range of plants, raw materials, and operating 
practices from which to choose when planning ironmaking operations, and the 
determination of an optimum choice for a given set of circumstances is 
virtually impossible with traditional methods. A mathematical model of the 
ironmaking and auxiliary processes that constitute an ironworks has been 
assembled in which the processes and their interactions are represented by 
equations. This model is being used to calculate the optimal practice 
satisfying specified requirements. 

† Formerly of the Operational Research Department, BISRA, London; now of the Cape Asbestos Company Limited.

‡ Formerly at the London School of Economics; now at the University of Pennsylvania.

§ Operational Research Department, BISRA, London.



benders’ partitioning method in nonconvex programming 


23 



In mathematical terms, the problem of choosing the best ironmaking 
practice becomes that of finding the global minimum of a nonconvex cost 
function subject to a number of equality and inequality constraints, some of 
which are also nonconvex. The procedure described in this article involves 
the reformulation of the problem in terms of mixed variables and the use of 
a partitioning procedure in order to arrive at a solution. The method is based 
on Benders’ approach to solving mixed-variables programming problems [1] 
and on a comment made by Beale [2].

The article also includes a discussion, in turn, of the mathematical 
formulation of a nonlinear program in terms of mixed variables, a method 
of solving the problem in this form, and of some refinements of the procedure. 


2. THE MATHEMATICAL FORMULATION OF A NONCONVEX 
PROGRAMMING PROBLEM IN TERMS OF MIXED VARIABLES 

2.1. The Problem 

The programming problem considered is that of minimizing a nonconvex 
cost function Z subject to a set of constraints, the majority of which are 
already linear. However, the procedure of reformulating the original problem 
as a linear one in mixed variables, developed in this section, may find application in more general situations.

2.2. Piecewise Linear Approximations 

The first step in the reformulation of the problem is to replace the original nonlinear functions by approximating functions. Piecewise linear approximations to the nonlinearities are defined by introducing sets of special variables (λ's) as in separable programming [3].

Figure 1 shows the approximation of a nonlinear function f(x) by a piecewise linear function over the range a_1 ≤ x ≤ a_n. Wherever it occurs, f(x) can be replaced by the approximation

f(x) = Σ_{i=1}^{n} λ_i · f(a_i),



where

x = Σ_{i=1}^{n} λ_i · a_i

and

1 = Σ_{i=1}^{n} λ_i,

if the new variables λ_i are non-negative and only adjacent pairs, at most, are simultaneously nonzero. This amounts to no more than approximating f(x) by linearly interpolating between adjacent points on the original function.
The error of the approximation can be minimized by choosing the number 
and the relative position of the linear sections. 

It is important to note that the replacement of nonlinear functions by 
piecewise linear approximations introduces more equations and variables 
into the problem. 
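The λ-representation just described can be sketched in a few lines (the breakpoints and the function f(x) = x² below are illustrative choices of ours, not from the BISRA model): a point between two breakpoints is written as a convex combination of the adjacent pair, and f is replaced by the same combination of its breakpoint values.

```python
# Sketch of the lambda-representation: only the adjacent pair of
# lambda-variables is nonzero, as the separable-programming formulation
# requires, so the approximation is linear interpolation along a chord.
import bisect

def pl_approx(x, breakpoints, f):
    # Locate the segment containing x, then interpolate on it.
    i = max(bisect.bisect_right(breakpoints, x) - 1, 0)
    i = min(i, len(breakpoints) - 2)
    a, b = breakpoints[i], breakpoints[i + 1]
    lam = (b - x) / (b - a)        # weight on a; 1 - lam falls on b
    return lam * f(a) + (1.0 - lam) * f(b)

grid = [0.0, 1.0, 2.0, 3.0]
approx = pl_approx(1.5, grid, lambda v: v * v)
print(approx)                      # chord value 2.5, versus f(1.5) = 2.25
```

The gap between the chord value and the true function value (0.25 here) is the approximation error that, as noted above, can be reduced by adding breakpoints where the curvature is largest.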

A typical version of the problem considered at BISRA involves 265 equality and inequality constraints, 225 specified variables, and 25 nonlinear functions. In making piecewise linear approximations to the nonlinearities, a further 125 variables and 25 equations are necessarily introduced; in addition, equations are introduced to define variables of special interest explicitly and to facilitate the calculation of the coefficients of the equations. The new formulation of the problem has 350 linear constraints in 450 specified variables.

2.3. Convex Subregions

The use of piecewise linear approximations to nonlinear functions is not, however, the complete answer to the problems of nonlinearity. As stated above, the special variables (λ's) have to be constrained to correspond to approximations to the functions. Only adjacent pairs of λ-variables can be simultaneously nonzero if at all times the function f(x) in the equations is to lie on one of the linear segments.

In cases in which the nonlinear functions are wholly or partially convex it is possible to relax this constraint; it is necessary only to ensure that the total feasible region of the problem is not changed in its new formulation, not that f(x) must at all times lie on a linear segment; for example, the constraint xy ≥ c can be written

x = Σ_i λ_i · a_i,

y ≥ Σ_i λ_i · (c/a_i),

1 = Σ_i λ_i,

without any further condition on the λ_i other than that of non-negativity.

In general it is possible to define subsets of a set of special variables which may be simultaneously nonzero. At the very least, these subsets will constitute adjacent pairs of λ-variables (in the case of strict concavity); at




best, the only subset will constitute the whole set (in the case of strict 
convexity). 

The choice of one such subset from each set of special variables, together 
with the ordinary variables of the programming problem, is said to define a 
convex subregion. 

2.4. The Mixed-Variables Problem 

To describe the distinct convex subregions mathematically, integer variables are introduced. To any set of special variables λ^s, s = 1, 2, ..., S, which includes m subsets corresponding to distinct convex subregions, allocate m integer variables d_j^s ≥ 0 such that

Σ_{j=1}^{m} d_j^s = 1

and such that the choice of d_J^s = 1 (d_j^s = 0, j ≠ J) corresponds to the choice of the Jth subset of {λ^s}. Then, if λ_r belongs to subsets J_1, J_2, ..., and J_q,

λ_r ≤ d_{J1} + d_{J2} + ⋯ + d_{Jq};

that is, if any one of the subsets J_1, J_2, ..., or J_q is chosen, λ_r can be nonzero, but the choice of any subset to which it does not belong (other than J_1, J_2, ..., or J_q) necessarily means that d_{J1}, ..., d_{Jq} are all zero and λ_r = 0.

Thus the formulation of the nonlinear programming problem in terms of mixed variables is as follows:

minimize_{x, λ} (Z = c^T · x + C^T · λ)  (1)

subject to the constraints

b = A · x + B · λ,  (2)

e_1 = H_1 · λ,  (3)

0 = I · λ − J · Δ + I · S,  (4)

e_2 = H_2 · Δ,  (5)

where Z is a scalar,

c, C, b, e_1, e_2, x, λ, Δ, S are column vectors,

x, λ, Δ, S all being non-negative, S being a vector of slack variables,

C^T = (c^{1T}, c^{2T}, ..., c^{nT}),

λ^T = (λ^{1T}, λ^{2T}, ..., λ^{nT}),

λ^s is a column vector, the elements of which are a set of special variables,

Δ^T = (d^{1T}, d^{2T}, ..., d^{nT}),

d^s are column vectors of integer variables corresponding to subsets of elements of λ^s,

e_1 and e_2 are column vectors whose elements are all unity,

A, B, H_1, I, J, H_2 are matrices, I being the identity matrix.

If an element of λ^s is λ_i^s and of d^s is d_j^s, (3) and (5) simply state that

Σ_i λ_i^s = 1 and Σ_j d_j^s = 1, for each s.




This is an ordinary linear program defined by the expressions (1), (2), and (3), subject to the extra constraints [expressed by (4) and (5)] that only certain subsets of the elements of λ can be simultaneously nonzero. Any value of the vector Δ satisfying (5) can be interpreted as corresponding to a specific convex subregion, and the phrases "a value of Δ" and "a convex subregion" are to be regarded as interchangeable.
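The linking between the integer and special variables can be sketched as follows; this is a minimal illustration of constraints (4) and (5) for a single set of special variables, with hypothetical subsets of our own (not taken from the BISRA model): choosing d_J = 1 permits exactly the λ-variables in the Jth subset to be nonzero.

```python
# Minimal sketch of the linking constraints: lambda_r can be nonzero only
# if some chosen subset contains index r, i.e. lambda_r <= sum of the d_J
# over the subsets J containing r.

subsets = [{0, 1}, {1, 2}, {2, 3}]    # adjacent pairs of lambda indices

def allowed_lambdas(d):
    # d is a 0-1 vector with exactly one 1, as constraint (5) requires.
    assert sum(d) == 1 and all(v in (0, 1) for v in d)
    ok = set()
    for j, dj in enumerate(d):
        if dj:
            ok |= subsets[j]
    return ok

print(allowed_lambdas([0, 1, 0]))     # lambda_1 and lambda_2 may be nonzero
```

Every other λ is forced to zero through the slack equation 0 = λ − J·Δ + S with S ≥ 0, which is what the "flagging" technique described in Section 3.2 implements without carrying Δ in the linear program.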


3. THE METHOD OF SOLUTION 
3.1. Outline of the Procedure 

In this section a method is given for solving the mixed-variables programming problem defined in Section 2.4. In the search for a global optimum
solution by any method that finds only local optima it is necessary, in effect 
at least, to consider every one of all the possible convex subregions. Only 
in this way can it be assured that a global optimum will be obtained. In the 
procedure given here subregions are systematically selected and the program, 
which is linear within a convex subregion, is solved to obtain a local optimum. 
The selection of a subregion for investigation is effected by solving a pure 
integer program. Figure 2 describes the basic procedure as a flow diagram. 


[Figure 2. Flow diagram of the basic procedure; it terminates when a global optimum has been obtained.]











The basis of the procedure is the partitioning (following Benders [1]) of the variables into two classes: integer (Δ) and noninteger (x, λ). Subsequently, the optimal values of x, λ are determined for a fixed Δ. Then a better Δ is obtained by using the locally optimal values of x, λ. Repetition of this process leads progressively to a global optimum, as shown in Section 3.5.

The remaining subsections discuss in turn the steps of the procedure above.

3.2. Step 1. The Linear Program in a Convex Subregion 

The choice of a particular convex subregion is effected by the choice of a value of the vector Δ of integer variables satisfying (5) of Section 2.4. It further corresponds to fixing at zero those λ-variables that do not correspond to this convex subregion, as described by (4) of Section 2.4. The linear program in a convex subregion therefore corresponds to the problem defined by relations (1), (2), and (3) in which certain of the elements of λ are constrained to be zero.

In the procedure for carrying out the linear programming calculations the computer programming technique of "flagging" variables is used. This obviates the need to include the integer variables Δ and the host of associated constraints described by (4) of Section 2.4 in the program. Subsets of λ which are constrained to be zero are "flagged"; that is, in the computer program a marker is associated with them. Subsequently they are operated on just as if they were ordinary artificial variables, which must be nonbasic in any feasible solution, whatever their associated costs.

In trying to solve the linear program in a convex subregion there can be 
one of three possible outcomes: 

1. The solution might be unbounded. 

2. A feasible and optimal solution might be obtained. 

3. The linear program may prove to be infeasible. 

If the solution is unbounded, the original problem is unbounded, since 
the linear program in a convex subregion is more constrained than the 
original problem. 

If a feasible and optimal solution is obtained, the coefficients of the nonbasic variables x_s have non-negative costs in the reduced form of the objective function. At this point the objective can be expressed as follows:

Z = Z_0 + Σ_s c_s · x_s + Σ_s c_s^1 · λ_s^1 + ⋯ + Σ_s c_s^n · λ_s^n,

where the coefficients c_s are all non-negative.

If the linear program proves to be infeasible, at least one constraint will have been transformed in the course of the calculations to the form

0 = b_0 + Σ_s b_s · x_s + Σ_s b_s^1 · λ_s^1 + ⋯ + Σ_s b_s^n · λ_s^n,

where the coefficients b_s are all non-negative.




3.3. Step 2. The Deduction of Constraints on the Global Optimum 
Solution and on the Area of Search 

In this section it is shown that the result of attempting to solve the linear 
program in a convex subregion permits either the construction of bounds 
on the global optimum solution or a limit on the feasible region. 

If an optimal solution to the linear program in a particular subregion is obtained, the objective function can be expressed as in Section 3.2 with c_s ≥ 0 for all s. Because the variables x_s and λ_s^i are all non-negative in any feasible solution and because Σ_s λ_s^i = 1 for each i, it follows that

Z ≥ Z_0 + minimum (Σ_i Σ_s λ_s^i · c_s^i).

Hence, in the convex subregion defined by the choice of the j_i th subset of elements from each λ^i, the others being fixed at zero,

Z ≥ Z_0 + Σ_i a_{j_i}^i,

where

a_{j_i}^i = min_{s in the j_i th subset} c_s^i.

Let d_j^i be the integer variable associated with the choice of the jth subset of elements of λ^i, as described in Section 2.4. Then the inequality can be written

Z ≥ Z_0 + Σ_j a_j^1 · d_j^1 + ⋯ + Σ_j a_j^n · d_j^n.

This expression defines a lower bound on Z, hence on the optimal solution, in any convex subregion. What is more, Z = Z_0 is the optimal solution in its subregion and is thus an upper bound on the global optimum solution.

If the linear program in a particular subregion proves to be infeasible, at least one constraint will have the form given in Section 3.2, where b_s ≥ 0 for all s. By a similar argument to that given above this can be written

0 ≥ b_0 + Σ_j a_j^1 · d_j^1 + ⋯ + Σ_j a_j^n · d_j^n.

This inequality defines a limit on the feasible region; any set of d_j^i satisfying (5) of Section 2.4 that does not satisfy the above inequality corresponds to a subregion in which the linear program will be infeasible.

3.4. Step 3. The Pure Integer Program

In Section 3.1 the method of finding a global optimum is said to involve 
the systematic selection of convex subregions within which to solve the linear 
program. In this section it is shown that the selection of the next subregion 
to consider at Step 3 of the procedure is a pure integer program. 

By the time Step 3 of the procedure has been reached at least one linear 
program will have been attempted and at least one constraint, of either of the 




types given in Section 3.3, will have been constructed. At a later stage in the 
operation of the procedure there will be, in general, several constraints of 
each of the two types. 

These constraint sets will have the form

0 ≥ b_{0r} + Σ_j a_{jr}^1 · d_j^1 + ⋯ + Σ_j a_{jr}^n · d_j^n,  r = 1, 2, ..., R,

Z ≥ Z_{0p} + Σ_j a_{jp}^1 · d_j^1 + ⋯ + Σ_j a_{jp}^n · d_j^n,  p = R + 1, ..., P.

In selecting the next convex subregion to consider it is clearly sensible 
to choose the subregion within which the global optimum solution is most 
likely to occur. Thus the subregion selected is the region that satisfies the 
first R inequalities for which the greatest lower bound on Z is least. This 
problem of selecting the subregion, a pure integer programming problem in 
0-1 variables, is further discussed in Appendix 1, in which the method being 
adopted to solve it is given. 

The values of d_j^i for which this pure integer program is feasible include those for which the original mixed-variables problem is feasible. Therefore, if the integer program is infeasible, the original program is too.

3.5. Step 4. Test for the Termination of the Procedure 

In this section a few observations are made about the convergence of the 
procedure and the criterion for terminating it is established. 

Because the global optimum is less than or equal to any local optimum, 
the (local) optimum solution in a convex subregion provides an upper bound 
on the global optimal solution. The least of all the locally optimum solutions 
obtained at any stage of the procedure is thus the current best upper bound 
on the global optimum. 

The solution to the pure integer program at any stage defines the current 
best lower bound on the global optimal solution. 

The whole procedure must converge and terminate because the solution 
of the linear program in a convex subregion provides the exact lower bound 
on Z as well as an upper bound on the global solution, and the number of 
subregions is finite. The procedure terminates when the solution of the 
integer program, the current best lower bound, is equal to the value of the 
best linear program solution obtained, the current best upper bound. The 
linear program solution is thus proved to be a global optimum. (To find all 
globally optimum solutions the procedure can be continued until all solutions 
of the integer program have been investigated by linear programming.) 
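The bounding loop above can be caricatured in a few lines. The sketch below is a deliberately simplified illustration of ours, not the BISRA implementation: each "convex subregion" is one segment of a piecewise linear objective, "solving the linear program" is an exact minimization over that segment, and the role of the pure integer program is played by a direct scan for the subregion with the least current lower bound.

```python
# Toy bound-driven search: solve each subregion's "linear program" only
# when it is the most promising, and stop when the best lower bound meets
# the best local optimum found so far (the termination test above).
import math

segments = [(0.0, 1.0, lambda x: 3.0 - 2.0 * x),            # local min 1.0
            (1.0, 2.0, lambda x: 1.0 + 0.5 * (x - 1.0)),    # local min 1.0
            (2.0, 3.0, lambda x: 1.5 - 1.0 * (x - 2.0))]    # global min 0.5

lower = [-math.inf] * len(segments)   # current lower bound per subregion
upper = math.inf                      # best local optimum found so far

while min(lower) < upper:
    j = lower.index(min(lower))       # subregion with least lower bound
    a, b, f = segments[j]
    local = min(f(a), f(b))           # linear piece: optimum at an endpoint
    lower[j] = local                  # this subregion's bound is now exact
    upper = min(upper, local)         # update the best upper bound

print(upper)                          # global minimum of the toy problem
```

Because the initial lower bounds are vacuous, this toy run examines every subregion before the bounds close, which mirrors the point made above: a local optimum found early can be certified as global only after it has been shown that no better local optimum exists.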

Each time the integer program is solved it has more constraints than at the last time of solving it; the solution obtained will therefore be a monotonically increasing function of the number of integer programs solved.

On the other hand, although a given subregion might appear to be particularly promising in that the lower bound on the solution is very low there, the linear program in this subregion might have an optimum solution greater than those of previous linear programs or it might even be infeasible.




It therefore follows that although a local optimum solution, obtained at 
an early stage, may prove to be the global optimum it is possible to be sure 
of this and to terminate the search only when it has been shown that a better 
local optimum does not exist. 

4. REFINEMENTS OF THE BASIC PROCEDURE 

4.1. Increased Efficiency of the Procedure 

In principle, the basic procedure is a search through all possible values of Δ to find that value for which Z can achieve its global minimum. The method proposed for selecting Δ at each stage is to choose the solution of a pure integer program as described in Section 3.4. However, the real problem of the over-all procedure is to minimize the over-all computing time.

Because of this, a number of refinements of this basic procedure, which 
are likely to increase the efficiency of the procedure and to reduce the com- 
puter time required, are being considered. These refinements are discussed 
in the following sections. 

4.2. Separable Programming 

The first of these refinements is that of separable programming. Instead 
of solving the linear program for a fixed value of A and then solving the 
integer program for the next value of A, it is likely to be more efficient to 
obtain a local optimum by separable programming, starting from the specified 
subregion (A), before solving the integer program for the next A to consider. 
This is because the method of separable programming ensures that a pro- 
gressively better solution will be obtained, if possible, by moving from one 
subregion into an adjacent one. The solution is thus improved without the 
expense in computing time of solving the integer program. Furthermore, 
the sooner a good solution (i.e., one close to optimal) is obtained, the more 
efficiently can areas in which the solution is far from optimal be excluded 
from further consideration. 

Each linear program of Step 1 of the procedure can thus be a separable 
program starting in the specified subregion. In this way the value of A, 
instead of being fixed, may be changed to the value corresponding to an 
adjacent subregion in a separable programming iteration. However, to 
ensure that the separable programming will not lead to a local optimum pre- 
viously obtained, certain restrictions must be placed on the operation of the 
separable program. 

The first restriction is to use the solution procedures of linear program- 
ming within the convex subregion until an optimum solution is obtained or 
infeasibility is demonstrated. Only then is a separable programming iteration 
considered. In this way the region to which the integer program directs the 
procedure, and subsequent regions, are explored exhaustively before moving 
on. 

A necessary further restriction in the case of feasibility within a subregion 
is to test the value of the objective function at a separable programming 



BENDERS' PARTITIONING METHOD IN NONCONVEX PROGRAMMING 31 

iteration against the current best locally optimum solution. If it is an 
improvement on the best value so far, a separable programming step into an 
adjacent subregion is, if possible, carried out; otherwise a constraint is added 
to the integer program. The integer program is then solved for the next 
subregion from which to start a new separable program. 
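The two restrictions above can be combined into one sketch. The fragment below is an illustrative Python sketch only, with all names (`separable_phase`, `solve_in`, `neighbors`) introduced here: `solve_in` stands for exhaustive linear programming within one convex subregion, and the visited set plays the role of the restriction that separable steps must not lead back to a previously obtained local optimum.

```python
def separable_phase(solve_in, neighbors, start, best_val):
    """Hedged sketch of the restricted separable phase.  solve_in(k)
    returns (value, feasible) for the convex subproblem in subregion k;
    neighbors(k) lists adjacent subregions.  Each subregion is explored
    exhaustively first; a step into an adjacent subregion is taken only
    while it improves on the incumbent best_val."""
    visited = set()
    k = start
    while k is not None and k not in visited:
        visited.add(k)
        val, feasible = solve_in(k)
        if not feasible or val >= best_val:
            break                      # hand back to the integer program
        best_val = val                 # improved local optimum
        nxt = None
        for kk in neighbors(k):        # try one separable step outward
            if kk not in visited:
                v, ok = solve_in(kk)
                if ok and v < best_val:
                    nxt = kk
                    break
        k = nxt
    return best_val, visited
```

On a one-dimensional chain of subregions with values 5, 3, 2, 4, starting at the first, the phase walks downhill to the value 2 and stops without the expense of an integer program.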

4.3. Multiconstraint Generation 

A second refinement is to use the fact that a number of constraints can, 
in general, be constructed in the progression toward a local optimum by 
linear or separable programming procedures. 

Any solution of the program for fixed A, not only the optimal, for which 
the reduced form of the cost function is as discussed in Section 3.2, can be 
used as in Section 3.3 to generate a constraint. Also, at any separable 
programming step of the type described in Section 4.2 a constraint can be 
generated. 

The use of a number of additional constraints derived from such 
intermediate solutions will enable the integer program to determine more 
quickly the optimum subregion, and to rule out the remainder. 

4.4. Suboptimization in the Integer Program 

A third refinement possible at Step 3 of the whole procedure is that of not 
solving the integer program exactly but only obtaining a more promising 
solution than the best so far investigated. 

There are at least two ways of obtaining such solutions to the integer pro- 
gram (see Appendix 2) which are not necessarily optimal. One is simply 
a systematic procedure of finding the solution to the integer program 
(Appendix 1) terminated before optimality is assured, by ending the solution 
procedure after a given time. The other is a systematic and speedy improve- 
ment of a previous solution to the integer program. 

4.5. Limitation of the Feasible Region from Prior Considerations 
and Preplanned Search 

In some mathematical programming problems it is possible to assert at 
the outset that certain regions are not of interest. Because of the simple inter- 
pretation of integer variables, it is relatively easy to insert constraints man- 
ually in the integer program which limit the area of search. 

In addition, the whole procedure can be started by a search of the most 
promising areas first if expert knowledge of the underlying physical problem 
can define such areas. This is effected by simply specifying a succession of 
subregions (values of A) and by solving in each case the resulting linear 
program. Subsequent to this, selection of A can be by integer program. 

5. THE DEVELOPMENT OF COMPUTER PROGRAMS 

Work is in progress at BISRA in developing computer programs for the ICT 
1905 computer. Initially, these programs will be capable of tackling linear 




programs of as many as 512 constraints in 512 specified variables and integer 
programs of as many as 100 integer variables. 

The programs will include facilities for separable programming. The 
use of a promising feasible solution to the integer program (Section 4.4) 
instead of the optimal solution and the use of multiconstraint generation 
(Section 4.3) will also be included in due course. 

6. CONCLUSIONS 

The method described in this article appears to have considerable promise 
as a method of nonlinear mathematical programming. This procedure, allied 
to separable programming procedures, is expected to ensure that the global 
optimum to a nonconvex problem can be found efficiently in a systematic 
search. 

Experience with the use of the method up to January 1966 has been 
encouraging. A problem involving 53 constraints in 83 variables, needing 
13 integer variables, was solved in 12 cycles of an earlier version of the pro- 
cedure described here; that is, 12 linear programs were solved to obtain the 
global optimum. Complete enumeration would have involved 75 linear 
programs. 

Although the procedure has been developed in order to solve a particular 
operational research problem in ironmaking, it is thought that the method 
will be applicable in other fields. In principle, any nonlinear optimization 
problem can be formulated and solved in the manner described in this article. 
The value of this approach in a particular case will depend to some extent on 
the degree of complexity of the nonlinear terms, hence on the number of 
additional variables and equations needed for the piecewise linear approxi- 
mations. Nevertheless, the procedure guarantees that a global optimum 
solution will be obtained, and this is an advance on most methods of solving 
nonlinear optimization problems. 

7. ACKNOWLEDGMENTS 

The work of Benders and of Beale has been referred to in the text; in 
addition, their contributions in informal discussions of the work reported in 
this article are acknowledged. Due thanks are also expressed to Mr. 
R. H. Fox of BISRA, who is writing the bulk of the computer programming 
system that will carry out the procedures described in this article, and who 
has contributed to the planning of the system. 


8. REFERENCES 

[1] J. F. Benders, "Partitioning Procedures for Solving Mixed-Variables Programming 
Problems," Numerische Mathematik, 4, 238-252 (1962). 

[2] E. M. L. Beale, "A Survey of Integer Programming," Operations Res. Quart., 16, 
No. 2, 219 (1965). 

[3] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, 
1963, pp. 483, 484. 




APPENDIX 1 

THE METHOD OF SOLUTION OF EACH PURE INTEGER PROGRAM 

The pure integer program solved at Step 3 of the over-all solution procedure 
is described in Section 3.4. In this appendix a method of solution of this 
program, currently in use at BISRA, is given. 

The program has the following form: 

minimize $w$ over the variables $(d_j^i)$, 

subject to the constraints 

$$\sum_j d_j^i = 1 \qquad \text{for each } i, \qquad (1)$$

$$0 \ge z_{0r} + \sum_i \sum_j d_{jr}^i d_j^i, \qquad r = 1, 2, \ldots, R, \qquad (2)$$

$$w \ge z_{0p} + \sum_i \sum_j d_{jp}^i d_j^i, \qquad p = R + 1, \ldots, P, \qquad (3)$$

where the optimal solution $w$ is the current best lower bound on $Z$. Since 
one and only one variable $d_j^i$ can take the value of unity for each $i$ 
(the remainder must be zero), the constraints (2) and (3) are rewritten 

$$0 \ge \sum_{i=1}^{n} \sum_j \bar{d}_{jr}^i d_j^i, \qquad r = 1, 2, \ldots, R, \qquad (4)$$

$$w \ge \sum_{i=1}^{n} \sum_j \bar{d}_{jp}^i d_j^i, \qquad p = R + 1, \ldots, P, \qquad (5)$$

where the coefficients $\bar{d}_{jr}^i$ and $\bar{d}_{jp}^i$ for $i > 1$ are all non-negative. 

1. The method of solution adopted is a tree search method in which 
choice is progressively made of particular variables $d_j^i$ as unity for $i = 1$, $i = 2$, 
and so on, only one $d_j^i$ being unity for any $i$. If $j = 1, 2, \ldots, m$, the problem of 
choosing $d_j^1$ is one of choosing from $m$ possible alternatives or branches; for 
any one of these there are $m$ branches in choosing $d_j^{i+1}$, and so on, thus forming 
a tree. The tree search method systematically considers all combinations 
of branches to arrive at an optimum solution. 

2. In addition to the constraints given above, the tree search method 
uses one more item of information. If the best solution to a linear program 
so far gives a current upper bound $Z'$ on the global optimum solution $Z$, 
interest is subsequently confined to a choice of $d_j^i$ for each $i$ such that 

$$w_0 = Z' > \sum_{i=1}^{n} \sum_j \bar{d}_{jp}^i d_j^i \qquad \text{for } p = R + 1, \ldots, P.$$

In practice, the parameter $w_0$ in this constraint is successively replaced by 
lower values, solutions of the integer program, as these are deduced in the 
tree search. The constraint enables the search for an optimal solution to be 
accelerated by cutting off uninteresting branches of the tree. 




3. The tree search method proceeds as follows. One $d_j^i$ is chosen to be 
unity for each $i$ in turn; $i = 1$, $i = 2, \ldots$. At any stage $i = I$, $d_j^I$ is chosen by 
solving the small integer program: 

minimize $w$ over $(d_j^I)$, 

subject to the constraints 

$$0 \ge \sum_{i=1}^{I} \sum_j \bar{d}_{jr}^i d_j^i, \qquad r = 1, 2, \ldots, R,$$

$$w \ge \sum_{i=1}^{I} \sum_j \bar{d}_{jp}^i d_j^i, \qquad p = R + 1, \ldots, P,$$

in which $d_j^i$ for $i < I$ are fixed. 

4. There are three possible outcomes at this stage: 

(a) This small program may be infeasible. 

(b) This small program may be feasible but may have an optimal 
solution $w \ge w_0$. 

(c) This small program may be feasible with an optimal solution 
$w < w_0$. 

5. If either (a) or (b) of paragraph 4 obtains, any further choice of 
$d_j^i$, $i > I$, will have the same property (or be infeasible if previously feasible), 
since $\bar{d}_{jr}^i \ge 0$ and $\bar{d}_{jp}^i \ge 0$ for all $r$ and all $p$. Thus all remaining branches 
of the tree from this point on can be ignored. The search then goes back to 
the preceding branching point in the tree, where a similar small integer pro- 
gram is solved, to choose a $d_j^{I-1}$ not yet investigated at this point in the tree. 

6. If (c) of paragraph 4 obtains, another small integer program is solved 
for $d_j^{I+1}$ unless $I = n$. If $I = n$, $w_0$ is replaced by 

$$\max_p \left( \sum_{i=1}^{n} \sum_j \bar{d}_{jp}^i d_j^i \right),$$

and the search goes back to the preceding branching point in the tree, where 
a similar small integer program is solved, to choose a $d_j^{I-1}$ not yet investigated 
at this point in the tree. 

7. The procedure terminates when all possible branches have, in effect, 
been considered. At this stage the current value of $w_0$ is the least lower 
bound on $Z$ and the corresponding values of $d_j^i$, $i = 1, 2, \ldots, n$, define the 
subregion in which this least lower bound occurs. 
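The branch-cutting logic of paragraphs 1 to 6 can be illustrated with a simplified sketch (not the authors' ICT 1905 code). It keeps only the essential mechanism: one option $j$ is chosen per stage $i$, the objective is a maximum of additive non-negative scores standing in for the right-hand sides of constraints (5), and a subtree is abandoned as soon as its partial bound reaches the incumbent $w_0$. The function name `tree_search` and the score array `a[p][i][j]` are notation introduced here.

```python
def tree_search(n, m, a, w0=float("inf")):
    """Minimize w = max_p sum_i a[p][i][j_i] by depth-first search, one
    choice j_i per stage i.  All a[p][i][j] are assumed non-negative, so
    a partial maximum that already reaches the incumbent w0 prunes the
    whole subtree, as in paragraphs 2 and 5 of Appendix 1."""
    P = len(a)
    best_choice = None

    def dfs(i, partial, choice):
        nonlocal w0, best_choice
        if i == n:                      # a leaf: a complete choice of subregion
            w = max(partial)
            if w < w0:
                w0, best_choice = w, choice[:]
            return
        for j in range(m):
            new = [partial[p] + a[p][i][j] for p in range(P)]
            if max(new) >= w0:          # bound: subtree cannot improve on w0
                continue
            dfs(i + 1, new, choice + [j])

    dfs(0, [0.0] * P, [])
    return w0, best_choice
```

With two stages, two options each, and scores 1, 3 at the first stage and 2, 5 at the second, the search returns the least bound 3 for the choice (0, 0), pruning the other branches against the incumbent.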


APPENDIX 2 

SUBOPTIMIZATION OF THE INTEGER PROGRAM 

In Appendix 1 a method of solving the integer program is given. However, 
it may prove to be more efficient to obtain a more promising solution to the 
integer program than the best previously selected, without ensuring its 





optimality. A solution is said to be more promising than a previous solution 
of the integer program if it satisfies the current (larger) set of constraints and 
has an associated lower bound on Z less than the best linear program solution 
obtained so far. Two procedures of obtaining suboptimal solutions of the 
integer program are considered in the following paragraphs. 

1. The first is to adopt the systematic procedure of Appendix 1 for a 
specified time $t$ and to terminate after this time if an improvement on the 
previous solution has been obtained. If such a solution has not been obtained 
by time $t$, the procedure terminates as soon as one is found. 

2. The second procedure suggested is quite different. Starting with a 
previous solution (A), individual sets of variables $d_j^i$, $j = 1, 2, \ldots$, are altered 
successively to achieve or maintain feasibility and to reduce the (new) lower 
bound on $Z$. In this way a "local optimum" solution to the integer program 
(or the best solution found by time $t$, as in paragraph 1) is obtained. However, 
there is no guarantee that a feasible solution or a more promising solution will 
always be achieved by this procedure. If it does not lead to a more promising 
solution, the method of Appendix 1 must be adopted. 

3. The lower bounds obtained by suboptimizations of the integer pro- 
grams will not have the property of optimal solutions of the integer program; 
they will not, in general, be monotonic increasing functions of the number of 
integer programs solved, nor will they necessarily be lower bounds on the 
global optimum solution. 

4. In both 1 and 2 there is a particularly attractive feature, the possibility 
of terminating the search after a specified time $t$. This allows a parameter to 
be built into the computer program which can be continually adjusted in the 
interests of over-all efficiency. 
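Procedure 2 above can be sketched as a simple neighbourhood search with a time limit. The fragment below is an illustrative sketch only: the names `improve`, `m`, and `score` are introduced here, with `score` standing in for the lower bound on $Z$ associated with a choice of subregion variables. It changes one stage's option at a time, accepts improvements, and stops at a local optimum or at the deadline, with no guarantee of optimality, exactly as cautioned above.

```python
import time

def improve(choice, m, score, deadline=0.05):
    """Hedged sketch of procedure 2: starting from a previous solution
    `choice` (one option index per stage), repeatedly change a single
    stage's option and accept any change that lowers score(choice).
    Stops at a local optimum of this neighbourhood or when `deadline`
    seconds have elapsed; neither optimality nor improvement is
    guaranteed."""
    t0 = time.monotonic()
    choice = choice[:]                    # do not clobber the caller's list
    best = score(choice)
    improved = True
    while improved and time.monotonic() - t0 < deadline:
        improved = False
        for i in range(len(choice)):
            for j in range(m):
                if j == choice[i]:
                    continue
                trial = choice[:]
                trial[i] = j
                s = score(trial)
                if s < best:              # accept the first improving move
                    choice, best, improved = trial, s, True
    return choice, best
```

For a separable score the coordinate-by-coordinate descent reaches the exact optimum; for a general score it may stop at a local optimum, in which case the exact method of Appendix 1 must take over.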


APPLICATION DE LA METHODE DE PARTAGE DE BENDERS 
A UN PROBLEME DE PROGRAMMATION NON-CONVEXE 


RESUME 

A nonconvex programming problem, of a type that arises frequently in 
operational research studies, is reformulated as a mixed-variables program. 
To this end, piecewise linear approximations are introduced to represent the 
nonlinear functions, as in separable programming; integer variables taking 
the values 0-1 are then introduced to express the constraints entailed by the 
nonconvexity of the program. 

Benders' partitioning method is used here to solve this mixed-variables 
programming problem; the basic method comprises the iterative solution of 
a linear program, alternating with the solution of a pure integer program. 
The linear program corresponds to solving the original problem in a convex 
subregion, while the integer program corresponds to the choice of the convex 
subregion to be examined next. Each pair of solutions furnishes an upper 
bound and a lower bound on the global optimal solution of the mixed-variables 
problem. The procedure terminates when the two bounds converge to the 
same value. 

To generate the constraints of the integer program, an improved method 
that preserves their characteristic form is used. A tree-search method of 
solving the pure integer program is also given. 

In addition to the basic procedure, several refinements are described that 
should accelerate the convergence of the procedure toward a global optimal 
solution. 


SOME PROBLEMS IN THE USE OF LINEAR 
PROGRAMMING IN OPERATION PLANNING 

Quelques Problemes dans l'Utilisation de la Programmation 
Lineaire pour la Preparation des Operations 

S. Tamaki 

Mitsubishi Oil Company, Tokyo , Japan 


1 . INTRODUCTION 

Linear programming has become one of the most fruitful techniques of 
operations research in the manufacturing and transportation industries. In 
operation planning in the oil industry linear programming provides a strong 
background for an optimum operation policy for management. The majority 
of linear programming applications in this industry, however, concern a 
program for selected parts of the total problem, in terms of locality and 
time period. 

This is especially important in short-term planning, in which the carry-over 
of inventory and the splitting up of crude resources among time periods are 
the governing parameters at each stage. 

To cope with these interactions, an integrated planning system is 
needed. 

In the design of such a large-scale system multistage and multilevel con- 
cepts are of the greatest significance; that is, the system is composed of many 
subsystems and a suboptimizer for each substage, and such systems, stacked 





by time periods, are governed by the over-all optimizer (Figure 1). The 
purpose of this article is to present mathematical and practical problems in 
the application of linear programming and then to describe in brief form an 
integrated production planning system for the oil industry. 

Theoretically, the system depends on the decomposition principle and on 
sensitivity analysis of linear programming. 


2. PROBLEMS IN THE USE OF LINEAR PROGRAMMING 

2.1. Mathematical Problems 

2.1.1. Nonlinearity 

To employ linear programming, assumptions must be made that will 
permit the approximate linear representation of actual operations. If we are 
faced with convex nonlinearity, such as the cost function for an octane booster 
for motor gasoline, segmental linear approximation is permissible. As for 
concave nonlinearity, such as the construction cost of a new plant, in which 
the unit cost decreases as the size of the plant increases, we are compelled to 
use separable programming by the upper bounded variable method. In a 
case calling for integer programming, such as tanker allocation for trans- 
porting crude oils, an efficient practical method cannot be expected. 
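The segmental (piecewise linear) treatment of a convex cost can be sketched concretely. The following Python fragment is an illustration introduced here (the function name `piecewise_linear` is hypothetical): it replaces a convex function by chords through equally spaced breakpoints. The chords overestimate a convex function and the approximation is itself convex, which is the property that makes the segmental treatment safe in a linear program.

```python
def piecewise_linear(f, lo, hi, segments):
    """Approximate f on [lo, hi] by `segments` chords through equally
    spaced breakpoints; returns a callable.  For a convex f the chords
    lie on or above f, and the resulting piecewise linear function is
    convex."""
    xs = [lo + (hi - lo) * k / segments for k in range(segments + 1)]
    ys = [f(x) for x in xs]

    def g(x):
        for k in range(segments):
            if x <= xs[k + 1]:
                t = (x - xs[k]) / (xs[k + 1] - xs[k])
                return ys[k] + t * (ys[k + 1] - ys[k])
        return ys[-1]

    return g
```

For example, approximating the convex cost x² on [0, 3] with three segments is exact at the breakpoints and overestimates in between (2.5 against 2.25 at x = 1.5).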

2. 1 .2. Uncertainty 

Mathematically, the treatment of uncertainty in the matrix elements or con- 
stants is not complete. Accordingly, we usually solve many possible alter- 
native cases and finally choose the optimum solution from the output, taking 
into consideration all preset objectives (safety, profitability, etc.). For this 











purpose several techniques to facilitate computation, such as matrix genera- 
tion and matrix condensation, are being applied. 

2.1.3. Dynamic Programming 

The most difficult problem in linear programming concerns sequencing. 
If the sequencing of activities in a substage has no effect on over-all profit- 
ability, a multistage linear programming model may theoretically be the best 
method. However, in practice, the size of the multistage model tends to 
exceed the capacity of the computer, and a decomposition method is required. 

2.2. Practical Problems in Planning 

Almost all oil companies in Japan have the same problems because of their 
geographical location and operating conditions. They import crude oils from 
the Middle East and many other parts of the world, refine them at plants 
located in various districts, transport the products to oil depots scattered all 
over the country, and finally sell them to consumers. Therefore the following 
problems are given special emphasis in the decisions of each company. 

1. Crude oil selection. 

2. Tanker movement allotment. 

3. Processing and blending operations in refineries. 

4. Transportation of products to oil depots from refineries and the 
transfer of intermediate products between refineries. 

5. Product mix to be produced and sold. 

Today many oil companies in Japan are using linear programming models 
to solve these problems, and management depends on the data from their 
solutions. Because the most important objective of operation planning is the 
selection of the crude oil slate, these linear programming models must cover 
the specific terms on which a contract to purchase crude oil is based. 

The main substance of the matrix lies in the blocks concerned with refinery 
operations and marketing, and they are linked by equations dealing with 
the following: 

1. Crude oil availability restrictions. 

2. Intermediate product availability restrictions. 

3. Final product availability restrictions. 

After computation of such a model and the selection of optimal crude 
slate, allotment of the crude oils is made to all substages monthly, weekly, or 
daily. Up to the present time decomposition and sequencing computations 
have been done by hand because fluctuations between substages are somewhat 
absorbed by inventories and it is hard to decompose the solution of a 
linear program into subprograms. However, the recent development of control 
systems and the advance in decomposition techniques have given us the key 
to this problem. 





Fluctuations and disturbances are considered to come from the following 
sources. Some can be estimated beforehand and can be included in the cal- 
culations, but it is impossible to predict others. 

1. Internal disturbances, resulting from upsets, down times, and the 
inability to maintain smooth control. 

2. Fluctuations in the amount and quality of the processed products. 

3. Fluctuations in inventory stocks that result from batch product ship- 
ments and deliveries. 

4. Fluctuations in quantities of crude oils that result from batch 
deliveries and undependable tanker arrival dates. 

5. Changes in external environment; for example, marketing trends 
and government policies. 

3. APPROACH TO A TOTAL PLANNING AND 
CONTROL SYSTEM 

3.1. General Philosophy 

Two features of the total system concept are of particular relevance in 
control system design: 

1. Partition of the over-all problem into more readily handled sub- 
problems. 

2. Division of planning and control efforts and responsibility on all 
operating levels. 

The system proposed in this report consists of three distinct levels. The 
uppermost is an over-all optimizer and the second is a group of substage 
optimizers; both take the form of linear programming models. The third level 
consists of subsystems and is a kind of simulator. The purpose of the system 
is to minimize (or maximize) the objective function of the over-all optimizer 
while satisfying the lower level subsystems. General functions of each sub- 
system are identified as follows: 

1. The over-all optimizer sets the objective of the total system; that is, 
it evaluates the plans of subsystems from the viewpoint of total cost or profit. 

2. Each second level suboptimizer is representative of one substage. The 
purpose of each is to generate a suboptimal plan based on data provided by 
the over-all optimizer and to supply information to lower subsystems con- 
cerning the economic effect associated with the parameters of subsystems. 

3. At the lowest level each subsystem describes divisional operations in 
accordance with input data and generates an environment for the relevant 
suboptimizer. Most of these subsystems have parameters that make connec- 
tion with the suboptimizer. 

3.2. Structure and Function of Optimizers 

Over-all and substage optimizers, expressed mathematically, are the 
following: find 

$$x_j^t \qquad (t = 1, 2, 3; \; j = 1, \ldots, n_t)$$

to minimize 

$$\sum_{t=1}^{3} \sum_{j=1}^{n_t} c_j^t x_j^t,$$

subject to 

$$\sum_{t=1}^{3} \sum_{j=1}^{n_t} a_{ij}^t x_j^t \le b_i^0 \qquad (i = 1, \ldots, h), \qquad (1)$$

for each $t$, 

$$\sum_{j=1}^{n_t} a_{ij}^t x_j^t \le b_i^t \qquad (i = h + 1, \ldots, m_t), \qquad (2)$$

and all 

$$x_j^t \ge 0, \qquad (3)$$

where $x_j^t$ = the operation variable for the $t$th substage, 
$c_j^t$ = the cost coefficient for the variable $x_j^t$, 
$a_{ij}^t$ $(i = 1, \ldots, h)$ = the coefficient of the over-all constraints, 
$a_{ij}^t$ $(i = h + 1, \ldots, m_t)$ = the coefficient of the constraints for the $t$th 
substage. 

Segregation of the over-all and suboptimizers is given later. In this case, in 
particular, the over-all constraints are crude availability for the over-all time 
period and the definition of inventories between adjacent time-stages. The 
cost factors $c_j^t$ and the matrix elements $a_{ij}^t$ of each substage are of the same 
type, except for internal disturbances; we therefore denote $c_j^t$, $a_{ij}^t$ $(i = h + 1, 
\ldots, m_t)$, $n_t$, and $m_t$ simply as $c_j$, $a_{ij}$, $n$, and $m$. According to the principle of de- 
composition given by Dantzig and Wolfe [2], this problem is solved by the 
following procedure. 

1. Solve each subprogram and make several feasible solutions, ignoring 
the over-all constraints. We assume each subprogram is bounded. Since all 
matrices for the substages are of the same type, after achieving a basic matrix 
for one stage, only simple successive iterations are required for the others. 

2. For each solution of a subprogram calculate the amounts of crude oil 
used, based on inventories at the beginning and end of the substage. Represent 
each of these solutions as a variable with an upper bound of 1; its coefficients 
are the quantities just calculated, so that these variables represent the effect 
of each solution on the over-all constraints. Solve the "restricted master 
program" formed by these calculations and determine the simplex multipliers 
for each constraint: 

$$\pi_i \quad (i = 1, 2, \ldots, h), \qquad \rho_t \quad (t = 1, 2, 3).$$




3. Solve the subprograms by using the multipliers. For the $t$th substage 
$(t = 1, 2, 3)$ minimize 

$$\sum_{j=1}^{n} \left( c_j - \sum_{i=1}^{h} \pi_i a_{ij} \right) x_j^t,$$

subject to 

$$\sum_{j=1}^{n} a_{ij} x_j^t \le b_i^t \qquad (i = h + 1, \ldots, m).$$

These subprograms correspond to the suboptimizers of the system. 

4. Compute the minimum relative cost factor, 

$$\min_t \left[ \sum_{j=1}^{n} \left( c_j - \sum_{i=1}^{h} \pi_i a_{ij} \right) \bar{x}_j^t - \rho_t \right].$$

If the minimum relative cost factor is non-negative, the solution is optimal. 
If it is not, introduce into the master program a new vector corresponding 
to the $\bar{x}_j^t$ that yields the minimum relative cost factor [2]. 
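Step 4 is the pricing test of the Dantzig-Wolfe loop, and its arithmetic can be shown in a few lines. The sketch below is an illustration only (the helper name `reduced_cost` and the use of plain Python lists in place of an LP tableau are introduced here): it computes the relative cost of one substage proposal under the master multipliers, and the master is optimal once every proposal prices out non-negative.

```python
def reduced_cost(c, A, x, pi, rho_t):
    """Relative cost of a substage proposal x: c.x - pi.(A x) - rho_t,
    where pi holds one multiplier per over-all constraint and rho_t is
    the convexity-row multiplier of this substage.  A non-negative value
    for every proposal means the current restricted master is optimal."""
    direct = sum(cj * xj for cj, xj in zip(c, x))
    priced = sum(pi[i] * sum(A[i][j] * x[j] for j in range(len(x)))
                 for i in range(len(pi)))
    return direct - priced - rho_t
```

A negative value signals that the proposal should enter the restricted master program as a new column; a value of zero or more rules it out.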

3.3. Structure and Functions of Subsystems 
Structure of subsystems 

Subsystems are simulators, programmed to generate matrices and the 
right-hand sides of optimizers. They are identified by the following 
two types: 

1. Subsystems of the first type generate matrices and right-hand sides of 
optimizers according to input data and are independent of the computational 
results of the optimizer; for example, the transportation matrix generator 
(subsystem D in Figure 2) generates transportation cost matrices only, and 
there is no feedback function to the optimizer and other subsystems. 

2. If, after generating matrices and right-hand sides, the result of 
optimization shows that there are more profitable right-hand sides, the sub- 
system of this type iterates generation of modified right-hand sides in the 
appropriate direction. 

Improvement function of subsystems 

After generating matrices and right-hand sides at the first iteration 
optimizers compute the first original plan. 

Because there may be some room for improvement in the allotment of in- 
puts to subsystems, and because of disturbances during operation, their 
economic evaluation and possible improvement in planning are required. By 
obtaining the simplex multipliers of the subconstraints corresponding to each 
suboptimizer, an economic evaluation of the allotment is provided; and if 
there are any differences exceeding the given limit between the corresponding 
multipliers, changes in the allotted right-hand sides of each substage are 
applied; for example, if $(\max_t \pi_i^t - \min_t \pi_i^t) \ge \delta_i$, then $b_i^t$ is replaced 
by $(b_i^t + \Delta b_i)$ for the $t$th stage of $\max \pi_i^t$ and $b_i^t$ is replaced by 
$(b_i^t - \Delta b_i)$ for the $t$th stage of $\min \pi_i^t$, 




where $\delta_i$ is the given allowance for the difference of the $\pi_i^t$, and $\Delta b_i$ is a small dis- 
placement of the right-hand side for the $i$th constraint between substages, 
determined from the point of view of actual operation. Because $\pi_i^t$ can be 
regarded as $\Delta c / \Delta b_i$, a saving of about $\Delta c = (\max_t \pi_i^t - \min_t \pi_i^t) \cdot \Delta b_i$ 
may be expected from this displacement. 

Example of improvement 

Let us describe the function of a crude tanker allotment subsystem. This 
subsystem does not operate in the first run of the optimizer. After the first 
computation, which determines the optimal crude oils for substages, the crude 
allotment of each tanker is made from outside and then put into the optimizer. 
(A crude tanker fleet chartered by the company makes a turn-around between 
loading ports and refineries.) Though this actual allotment is made from the 
optimal plan, there might be some discrepancy between actual allotted and 
optimal crude availability. Improvement in the plan is made by changing the 
tanker allotment or by changing the sequencing order of tanker arrivals. It 
must be noted that this improvement cannot always be expected, for if the 
displacement $\Delta b_i$ brings about a change in basis, the relation of the values 
of $\pi_i^t$ may change. Therefore a modification rule, that is, how to determine 
the magnitudes of $\delta_i$ and $\Delta b_i$, should be given from the viewpoint of actual 
operation. After the modification, the over-all optimization procedure is 
reapplied. 


4. CONCLUSION 

Operation planning and control concepts must be developed into a real-time 
computer system. There may be some instances of the application of such a 
system to total refinery control, but in very few cases has such a system been 
applied to total company control, as I have proposed here. This system 
approach is based on the valuable suggestions made by Mr. S. Kasuga, 
manager of the Statistics and Computer Department, Mitsubishi Oil Com- 
pany, to whom I express my gratitude. 


5. REFERENCES 

[1] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, 
Princeton, N.J., 1963. 

[2] G. B. Dantzig and Philip Wolfe, "Decomposition Principle for Linear Programs," 
Operations Res., 8, No. 1, 101-111 (1960). 

[3] M. D. Mesarovic, I. Lefkowitz, and J. D. Pearson, "Advances in Multilevel 
Control," Proc. IFAC Congr., Tokyo, 1965. 

[4] A. R. Catchpole, "The Application of Linear Programming to Integrated Supply 
Problems in the Oil Industry," Operations Res. Quart., 13, No. 2, 161 (June 
1962). 

[5] K. Matsumoto, "Applications and Problems of Linear Programming," Chem. 
Eng. (Japan), 28, No. 30, 895-899 (October 1965). 





QUELQUES PROBLEMES DANS L'UTILISATION 
DE LA PROGRAMMATION LINEAIRE POUR 
LA PREPARATION DES OPERATIONS 

RESUME 

Although linear programming has already been widely used in the oil 
industries of the world to furnish an optimal operating policy, several 
important problems remain unsolved, from the practical as well as the 
mathematical point of view. Among these, the two most important are to 
exploit the optimal solution for the dynamic character of actual operation, 
which is tied to the time factor, and to apply a control function against 
internal and external disturbances. 

To approach these problems, the author proposes in this article a 
large-scale system based on multistage and multilevel concepts. 


APPROXIMATE ALGORITHM FOR CONSTRUCTION 
OF OPTIMAL RELIABLE SYSTEM WITH 
ARBITRARY STRUCTURE 

Algorithme Approche pour la Construction d'un Systeme 
Optimal Sur Ayant une Structure Arbitraire 

Igor A. Ushakov 

State Committee of Radioelectronics, 
Moscow, U.S.S.R. 


Problems concerning the optimal allocation of resources appear if we want 
to develop complex equipment with a high effectiveness index of reliability. 
The quality of functioning of complex systems is evaluated by indices that de- 
pend on the type of system, on its purposes, and on the criteria of a 
satisfactory functioning system. We usually have some restrictions; for ex- 
ample, the system cost may be a restricted resource for one type of system 
and the system weight may be for another. Let us denote any restricted 
resource by the name "weight," independent of its actual nature. 

A system consisting of n elements is considered. It is assumed that the ith
element may be in only one of two states: the state of proper working order
(S_i = 1) and the state of failure (S_i = 0). Then at any arbitrary fixed
moment the over-all system is in one of 2^n different states S = (S_1,
S_2, ..., S_n), where S_i is 0 or 1.

The general method of evaluation of the effectiveness index of a complex
system, in cases in which its elements are subject to failure, is given in [1]:

F = Σ_S H_S φ_S.   (1)

Here H_S is the probability that the system is in the Sth state and φ_S is a
system conditional effectiveness index characteristic of the Sth state. The
summation is over all the 2^n values of the symbol S.

We can easily calculate the probability H_S if all elements are assumed
independent:

H_S = Π_{i=1}^{n} r_i^{S_i} (1 - r_i)^{1-S_i}.   (2)

Here r_i is the probability that at the selected arbitrary fixed moment of time
the ith element is in the working state.

The choice of an index φ_S can provide us with a system effectiveness index
F in a form that we want. Naturally the indices φ_S do not depend on the
values r_i.
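For small n, the index defined by (1) and (2) can be evaluated by direct enumeration of the 2^n states. The sketch below is not from the paper; the 2-out-of-3 index φ is an illustrative assumption:

```python
from itertools import product

def effectiveness(r, phi):
    """F = sum over all 2^n states S of H_S * phi(S), where
    H_S = prod over i of (r_i if S_i = 1 else 1 - r_i)."""
    total = 0.0
    for state in product((0, 1), repeat=len(r)):
        h = 1.0
        for r_i, s_i in zip(r, state):
            h *= r_i if s_i else 1.0 - r_i
        total += h * phi(state)
    return total

# Illustrative phi: a 2-out-of-3 system is effective when >= 2 elements work.
phi = lambda s: 1.0 if sum(s) >= 2 else 0.0
F = effectiveness([0.9, 0.9, 0.9], phi)  # approx. 0.972
```

Enumeration is exponential in n, which is why approximations such as the one developed later in the paper matter for large systems.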

Let us consider a system for which different variants can be employed
in building the various elements; for example, the ith element can be made
in the variant forms i_1, i_2, ..., i_j, .... The jth variant of the ith element is
characterized by two indices: the first is a reliability r_i(i_j) and the second is a
weight g_i(i_j). In principle, the number of different variants may be infinite.
Thus redundancy leads to an infinite countable set of variants.

We can improve the effectiveness index of the system in several essentially 
different ways: one way is by a change in the system structure or principles 
of its functioning; a second way is by improvement of the element reliability 
without any system structure change. The second way is the simpler one and 
in most cases the only one during the later stages of system design. In this 
case it is interesting to consider the task of the optimal allocation of the full 
system weight between its elements to maximize the system effectiveness 
index. 

First, let us define how the effectiveness index F depends on the reliability
of each system element. The set of the system states can be divided into two
subsets: the first includes all states such that S_i = 1 for some fixed ith element
and the second includes all states such that S_i = 0 for the same element.
Then using the expression (2) we have from (1)

F = r_i Σ_{S*} H_{S*} [φ_{S*,1} - φ_{S*,0}] + Σ_{S*} H_{S*} φ_{S*,0}.   (3)

Here S* is a system state without respect to the ith element; {S*, 0} and
{S*, 1} are the system states characterized by S* and S_i = 0 or S_i = 1,
respectively.





Thus from (3) we can conclude that the effectiveness index F is a linear
function of r_i.

The next preliminary step of the investigation is to study the dependence
of the element reliability on its weight. All variants of the ith element design
may be arranged in a set with their corresponding values of r_i(i_j).

We call a variant expedient if the following condition is satisfied:

r_i(i_{j+1}) > r_i(i_j)   if   g_i(i_{j+1}) > g_i(i_j).

In other words, for the expedient variants the greater the weight of the
element variant, the greater the reliability achieved. If this condition is not
satisfied, the variant is called nonexpedient. We shall consider only expedient
variants.
Under certain conditions the function r_i(g) for expedient variants is strictly
convex; for example, this will arise when we use redundancy for improvement
of the element reliability. If for some j and k the set of expedient variants
is characterized by the following inequality,

[r_i(i_{j+k}) - r_i(i_j)] / [g_i(i_{j+k}) - g_i(i_j)] > [r_i(i_{j+m}) - r_i(i_j)] / [g_i(i_{j+m}) - g_i(i_j)]

for all 0 < m < k [i.e., r_i(g) has local concavities], we can expel the points
i_{j+m} for practical evaluations. In this case the function r_i(g) is the convex
broken line "drawn on" the set of the points r_i(g).

With this assumption a new function F is a strictly convex function for all
arguments g_i.
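The screening just described, discarding nonexpedient variants and expelling locally concave points so that only the "convex broken line" remains, can be sketched as follows. This is a hypothetical illustration, not the paper's code; variants are (weight, reliability) pairs:

```python
def expedient(variants):
    """Keep only expedient variants: greater weight must buy strictly
    greater reliability. `variants` is a list of (g, r) pairs."""
    kept = []
    for g, r in sorted(variants):
        if not kept or r > kept[-1][1]:
            kept.append((g, r))
    return kept

def broken_line(variants):
    """Expel points lying on or below a chord of the (g, r) plot, leaving
    the broken line 'drawn on' the set of expedient points."""
    line = []
    for g3, r3 in expedient(variants):
        while len(line) >= 2:
            (g1, r1), (g2, r2) = line[-2], line[-1]
            # Is the middle point on or below the chord (g1, r1) -> (g3, r3)?
            if (r2 - r1) * (g3 - g1) <= (r3 - r1) * (g2 - g1):
                line.pop()
            else:
                break
        line.append((g3, r3))
    return line

# The locally concave point (2, 0.55) is expelled:
hull = broken_line([(1, 0.5), (2, 0.55), (3, 0.9)])  # [(1, 0.5), (3, 0.9)]
```

Only the surviving points need be considered by the gradient procedure, since each successive step along the broken line has a diminishing reliability-per-weight ratio.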

Thus we can pose two optimization tasks:

(a) to find

Sup F[g_i(i_j); i = 1, 2, ..., n; j = 1, 2, ...]   (4)

subject to

g = Σ_{i=1}^{n} g_i ≤ g_0;

(b) to find

Inf g[g_i(i_j); i = 1, 2, ..., n; j = 1, 2, ...]   (5)

subject to

F(r_i; i = 1, 2, ..., n) ≥ F_0.

Here g_0 and F_0 represent the maximum permissible weight and minimum
required effectiveness index of the system, respectively.

These two tasks are typical of convex programming with linear restrictions.
Therefore we can use the steepest ascent method to solve such problems. We
suggest the following practical algorithm for the optimal procedure:

First, the system variant with the minimum index F^(0) is chosen. This
variant has, with our assumption, the minimum weight

g^(0) = Σ_{i=1}^{n} g_i(i_0).





Then we must find the element on which reliability improvement has
the greatest impact for maximization of the system effectiveness index. At
the first step of the optimal process we calculate the values

γ_i^(1) = [F_i^(1) - F^(0)] / [g_i^(1) - g^(0)],   i = 1, 2, ..., n,

and

γ_{i*}^(1) = max_{1≤i≤n} γ_i^(1)

is found. The variant i*_0 of that element is replaced by the variant i*_1. We
assume that the initial state before the second step of the optimal process is
characterized by F^(1) and g^(1). The same process is continued; that is, we
obtain the values

γ_i^(2) = [F_i^(2) - F^(1)] / [g_i^(2) - g^(1)]   (6)

for all i = 1, 2, ..., n, and so on.

By the Nth step of the process the system consists of the element variants
1_{j_1(N)}, 2_{j_2(N)}, ..., n_{j_n(N)}, and its weight is equal to

g^(N) = Σ_{i=1}^{n} g_i(i_{j_i(N)}).   (7)

It is clear that N = Σ_{i=1}^{n} j_i(N).

We calculate the values of γ_i^(N) by use of the expressions (3) and (7):

γ_i^(N) = {F[i_{j_i(N)+1}] - F[i_{j_i(N)}]} / {g_i(i_{j_i(N)+1}) - g_i(i_{j_i(N)})}.   (8)

If the reliability of the system elements is close to unity (i.e., 1 - r_i ≪ 1/n
for all i = 1, 2, ..., n), we can obtain the approximate expression

γ_i^(N) ≈ {[r_i(i_{j_i(N)+1}) - r_i(i_{j_i(N)})] / [g_i(i_{j_i(N)+1}) - g_i(i_{j_i(N)})]} Σ_{k≠i} (φ_E - φ_{E,S_k=0})(1 - r_k).   (9)

Here φ_E is the conditional system effectiveness index under the condition
that all elements are working; that is, S_1 = 1, ..., S_n = 1; φ_{E,S_k=0} is the
conditional system effectiveness index under the condition that all elements
are working with the exception of the kth; that is, S_1 = 1, S_2 = 1, ...,
S_{k-1} = 1, S_k = 0, S_{k+1} = 1, ..., S_n = 1. To solve this optimal problem we
suggest the following practical method:

1. We describe all variants of design for the ith element (i = 1, 2, ..., n):
i_1, i_2, ..., i_j, ....

2. We calculate the indices of reliability and weight, r_i(i_j) and g_i(i_j),
respectively, for all variants of the element.





3. We construct the following table in which the expedient variants are
arranged in order:

Number of
variant     Element 1              Element 2              ...   Element n

   j        r_1(1_j)   g_1(1_j)    r_2(2_j)   g_2(2_j)    ...   r_n(n_j)   g_n(n_j)
   0        r_1(1_0)   g_1(1_0)    r_2(2_0)   g_2(2_0)    ...   r_n(n_0)   g_n(n_0)
   1        r_1(1_1)   g_1(1_1)    r_2(2_1)   g_2(2_1)    ...   r_n(n_1)   g_n(n_1)
   2        r_1(1_2)   g_1(1_2)    r_2(2_2)   g_2(2_2)    ...   r_n(n_2)   g_n(n_2)
  ...

4. The gradients γ_1^(1), γ_2^(1), ..., γ_n^(1) for the first step are
calculated:

γ_1^(1) = [r_1(1_1) - r_1(1_0)] / [g_1(1_1) - g_1(1_0)] · Σ_{k=2}^{n} (φ_E - φ_{E,S_k=0})[1 - r_k(k_0)],

γ_2^(1) = [r_2(2_1) - r_2(2_0)] / [g_2(2_1) - g_2(2_0)] · Σ_{k=1, k≠2}^{n} (φ_E - φ_{E,S_k=0})[1 - r_k(k_0)],

and so on.

5. We find the maximum gradient; for example, if it is γ_1^(1), the variant
1_0 of the element is replaced by the variant 1_1.

6. We calculate the gradients γ_1^(2), γ_2^(2), ... for the second step.

7. We determine the maximum of the obtained gradients. Then we
replace the last variant of the respective element by the next one from the
table.

The values of the system effectiveness index F and system weight g are
calculated during every step of the process. The process continues until
conditions (4) and (5) are satisfied.
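Steps 1-7 amount to a greedy steepest-ascent search over the variant table. A minimal sketch, assuming each element's variant list is already expedient and ordered by weight; the function and variable names are illustrative, not the paper's:

```python
def greedy_allocation(variants, effectiveness, g_max):
    """Steepest-ascent variant selection for task (a).
    variants[i]: list of (g, r) pairs for element i, ordered by weight.
    effectiveness(r_vec) -> system index F.  g_max: weight restriction g_0."""
    n = len(variants)
    j = [0] * n                                   # current variant of each element
    def F():
        return effectiveness([variants[i][j[i]][1] for i in range(n)])
    def g():
        return sum(variants[i][j[i]][0] for i in range(n))
    while True:
        f0, g0 = F(), g()
        best_i, best_gamma = None, 0.0
        for i in range(n):
            if j[i] + 1 < len(variants[i]):
                dg = variants[i][j[i] + 1][0] - variants[i][j[i]][0]
                if g0 + dg > g_max:
                    continue                      # would violate the restriction
                j[i] += 1
                gamma = (F() - f0) / dg           # gradient, as in (6)
                j[i] -= 1
                if gamma > best_gamma:
                    best_i, best_gamma = i, gamma
        if best_i is None:                        # no admissible improvement left
            return j, f0, g0
        j[best_i] += 1

# Illustrative series system (assumed): F = r_1 * r_2, weight limit g_0 = 5.
variants = [[(1, 0.8), (2, 0.9), (3, 0.95)], [(1, 0.8), (2, 0.9), (3, 0.95)]]
j, F_best, g_best = greedy_allocation(variants, lambda r: r[0] * r[1], 5)
```

As the paper notes, the procedure is approximate: the greedy choice need not reach the global optimum of (4) unless the per-element reliability/weight curves have the diminishing-ratio property secured by the broken-line construction.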


CONCLUSION 

We note that similar optimization problems of reliability theory were solved in
several papers (e.g., [3-7]) for less general systems (usually for simple redun-
dancy) but with several restrictions. Most of these papers give exact
algorithms. The problem considered here is concerned with a more general
type of system and a more general type of effectiveness index, but the
algorithm is approximate.


REFERENCES 

[1] I. A. Ushakov, "Evaluation of Effectiveness of Complex Systems," in the collection
of papers Reliability of Radioelectronic Devices, Sovetskoe Radio, Moscow,
1960.

[2] I. A. Ushakov, "Approximate Algorithm for Construction of Optimal Reliable
System with Arbitrary Structure," Tekhn. Kibernetika, No. 2 (1965).



ADVANCES IN TECHNIQUES OF MATHEMATICAL PROGRAMMING

[3] R. Bellman and S. Dreyfus, "Dynamic Programming and Reliability of Multi-
component Devices," Operations Res., 6, No. 2 (1958).

[4] G. Black and F. Proschan, "On Optimal Redundancy," Operations Res., 7, No. 5
(1959).

[5] J. Kettelle, "Least Cost Allocation of Reliability Investment," Operations Res.,
10, No. 2 (1962).

[6] F. Moskowitz and J. McLean, "Some Reliability Aspects of System Design,"
Trans. IRE, PGRQC-9, No. 8 (1956).

[7] Masafumi Sasaki, "A Simplified Method of Obtaining Highest System Reliability,"
Proc. 8th Nat. Symp. Reliab. Qual. Control, 1962.


ALGORITHME APPROCHÉ POUR LA
CONSTRUCTION D'UN SYSTÈME OPTIMAL SÛR
AYANT UNE STRUCTURE ARBITRAIRE

RÉSUMÉ

On définit un indice de qualité de fonctionnement (indice d'efficacité) pour des
systèmes ayant une structure arbitraire. On propose un algorithme approxi-
matif pour la répartition optimale du "poids" du système (ou d'un autre
facteur contraignant) entre ses éléments, en vue de la maximisation de l'indice
d'efficacité.

L'algorithme est basé sur la méthode de la plus grande pente, qui est
utilisée dans les problèmes de programmation convexe avec contraintes
linéaires.


Primal-Dual Decomposition Programming

Programmation de Décomposition
Primale-Duale

Earl J. Bell

The Amos Tuck School of Business Administration,
Dartmouth College, Hanover, New Hampshire,
United States of America


The primal-dual method of Dantzig, Ford, and Fulkerson is superimposed
on the decomposition principle of Dantzig and Wolfe in order to solve large-
scale linear programs whose matrices have the required block-angular
structure.

Preliminary results indicate that the present method is more efficient than
the usual "primal" method, which uses two phases of the simplex algorithm
and linear subprograms. By contrast, our method uses only one phase and
subprograms that are nonlinear because the function to be maximized involves
a quotient of two linear functions. Such programs are called hyperbolic, or
linear-fractional, and can be solved by equivalent linear programs within the
framework of the primal-dual method.

Finally, it appears that the method is a composite algorithm whose mixed-
pricing parameter is varied during the course of the calculation rather than in
the usual arbitrary way. The choice of the initial dual solution can cause the
method to be more, or less, efficient, and therefore each problem type merits
some prior analysis.


La méthode primale-duale de Dantzig, Ford, et Fulkerson est superposée sur
le principe de décomposition de Dantzig-Wolfe pour résoudre les programmes
linéaires de grande taille dont la matrice a la structure "block-angulaire".

Des résultats préliminaires indiquent que la méthode présente est plus
efficace que la méthode "primale" qui, d'habitude, emploie deux phases de
l'algorithme simplex et des subprogrammes linéaires. Notre méthode, par contre,
n'emploie qu'une seule phase moyennant des subprogrammes qui sont non-
linéaires parce que la fonction à maximiser est un quotient de deux fonctions
linéaires. Ces programmes dits hyperboliques peuvent être résolus par pro-
grammes linéaires dans les circonstances dictées par la méthode primale-duale.

Enfin, il paraît que la méthode n'est qu'un algorithme composé dont le
choix du paramètre "mixing" est déterminé au cours du calcul plutôt que
d'après la façon arbitraire habituelle. Le choix de la première solution duale
peut rendre la méthode plus, ou moins, efficace et, donc, chaque type de
problème mérite de l'analyse antérieure.


On an Algorithm for Nonlinear Fractional Programming†

Sur un Algorithme pour la Programmation Non-Linéaire Fractionnaire

W. Dinkelbach

Industrieseminar der Universität zu Köln,
West Germany


All problems of mathematical programming not belonging to linear program- 
ming are called problems of nonlinear programming or nonlinear programs. 
There are no general algorithms for nonlinear programs without special assump- 
tions concerning the objective function and the corresponding constraints. This 
paper deals with a special class of nonlinear programming problems that have a 
fractional objective function; that is, the objective function consists of a fraction 
with nonlinear as well as linear terms in the numerator and denominator. This 
kind of problem is referred to as fractional programming. If numerator, as 

† The full paper has been published in Management Science, 13, No. 7, 1967, under the
title "On Nonlinear Fractional Programming."





well as denominator, is a linear function and the set of feasible solutions is 
polyhedral, the problems in question are called linear fractional programs. The 
purpose of this paper is to delineate an algorithm for fractional programming 
problems with nonlinear terms in the numerator and denominator. This 
algorithm requires some special assumptions; for instance the concavity of the 
numerator and the convexity of the denominator for solving a maximum prob- 
lem. The algorithm presented, based on a theorem by Jagannathan, concerns 
the relationship between fractional and parametric programming. This theorem 
is restated. Finally, a simple numerical example is outlined. 
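The fractional-parametric relationship the abstract refers to can be sketched as an iteration: solve the parametric problem max N(x) - qD(x), reset q to the attained ratio N(x)/D(x), and stop when the parametric optimum is zero. This is a generic sketch under the stated concavity/convexity assumptions, with a hypothetical brute-force subproblem solver; it is not the paper's own code:

```python
def parametric_fractional(N, D, solve_parametric, q0=0.0, tol=1e-9, max_iter=100):
    """Iterative scheme for max N(x)/D(x) with D(x) > 0: the optimal ratio q
    is the unique root of F(q) = max_x [N(x) - q*D(x)].
    solve_parametric(q) must return an x maximizing N(x) - q*D(x)."""
    q = q0
    for _ in range(max_iter):
        x = solve_parametric(q)
        if abs(N(x) - q * D(x)) < tol:   # F(q) = 0  <=>  q is the optimal ratio
            return x, q
        q = N(x) / D(x)                  # update the parameter to the new ratio
    return x, q

# Toy example (assumed, not from the paper): maximize (x + 1) / (x^2 + 1)
# over the grid x in {0, 0.1, ..., 2}, solving each subproblem by brute force.
xs = [i / 10 for i in range(21)]
N = lambda x: x + 1
D = lambda x: x * x + 1
solve = lambda q: max(xs, key=lambda x: N(x) - q * D(x))
x_opt, q_opt = parametric_fractional(N, D, solve)
```

Each iteration strictly increases q toward the optimal ratio, so the loop typically terminates in a handful of subproblem solves.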


Tous les problèmes de la programmation mathématique ne concernant pas les
questions de la programmation linéaire sont dénommés problèmes de la pro-
grammation non-linéaire ou programmes non-linéaires. Il n'y a aucun algorithme
général pour les programmes non-linéaires sans présomptions spéciales quant
à la fonction économique et aux restrictions y relatives. Aux pages suivantes
une question spéciale de la programmation non-linéaire est traitée: la fonction
économique à maximiser est égale à une fraction se composant de termes linéaires
et non-linéaires non seulement dans le numérateur mais aussi dans le dénomi-
nateur. Un tel problème s'appelle "programmation fractionnaire". Si le
numérateur ainsi que le dénominateur sont représentés par des fonctions
linéaires et si l'espace des activités est polyédrique, les problèmes en question
sont dénommés "programmation fractionnaire linéaire". Ce traité se propose
de développer un algorithme pour les problèmes de la programmation frac-
tionnaire avec des termes non-linéaires au numérateur et au dénominateur.
Cet algorithme nécessite quelques présomptions spéciales: pour résoudre un
problème de maximum, par exemple, la concavité du numérateur et la convexité
du dénominateur sont demandées. L'algorithme exposé se base sur un théorème
de Jagannathan relatif aux relations entre la programmation fractionnaire et
paramétrique. Ce théorème est formulé à nouveau. Le tout est suivi d'un
bref exemple numérique.



SESSION II

PROGRESS IN TECHNIQUES
OF DECISION THEORY

Progrès dans les Techniques
de Théorie de Décision

Chairman: H. Raiffa (United States of America)




DECISION ANALYSIS: APPLIED DECISION THEORY

Analyse des Décisions: Théorie Appliquée
des Décisions

Ronald A. Howard

Institute in Engineering-Economic Systems
Stanford University, California
United States of America


1. INTRODUCTION 

Decision theory in the modern sense has existed for more than a decade. Most
of the effort among the present developers of the theory has been devoted to 
Bayesian analysis of problems formerly treated by classical statistics. Many 
practical management decision problems, however, can be handled by formal 
structures that are far from novel theoretically. The world of top management 
decision making is not often structured by simple Bernoulli, Poisson, or normal
models.

Indeed, Bayes’s theorem itself may not be so important. A statistician for 
a major company wrote a report in which he commented that for all the talk 
about the Bayesian revolution he did not know of a single application in the 
company in which Bayes’s theorem was actually used. The observation was 
probably quite correct — but what it shows by implication is that the most sig- 
nificant part of the revolution is not Bayes’s theorem or conjugate distributions 
but rather the concept of probability as a state of mind, a 200-year-old concept. 
Thus the real promise of decision theory lies in its ability to provide a broad 
logical basis for decision making in the face of uncertainty rather than in any 
specific models. 

The purpose of this article is to outline a formal procedure for the analysis 
of decision problems, a procedure that I call “decision analysis.” We shall also 
discuss several of the practical problems that arise when we attempt to apply 
the decision analysis formalism. 

2. DECISION ANALYSIS

To describe decision analysis it is first necessary to define a decision. A decision 
is an irrevocable allocation of resources, irrevocable in the sense that it is im- 
possible or extremely costly to change back to the situation that existed before 
making the decision. Thus for our purposes a decision is not a mental commit- 
ment to follow a course of action but rather the actual pursuit of that course of 
action. This definition often serves to identify the real decision maker within a 
loosely structured organization. Finding the exact nature of the decision to be
made, however, and who will make it, remains one of the fundamental problems
of the decision analyst.

Having defined a decision, let us clarify the concept by drawing a necessary 
distinction between a good decision and a good outcome. A good decision is a 
logical decision — one based on the uncertainties, values, and preferences of the 
decision maker. A good outcome is one that is profitable or otherwise highly 
valued. In short, a good outcome is one that we wish would happen. Hopefully, 
by making good decisions in all the situations that face us we shall ensure as 
high a percentage as possible of good outcomes. We may be disappointed to 
find that a good decision has produced a bad outcome or dismayed to learn 
that someone who has made what we consider to be a bad decision has enjoyed 
a good outcome. Yet, pending the invention of the true clairvoyant, we find no 
better alternative in the pursuit of good outcomes than to make good decisions. 

Decision analysis is a logical procedure for the balancing of the factors that 
influence a decision. The procedure incorporates uncertainties, values, and 
preferences in a basic structure that models the decision. Typically, it includes 
technical, marketing, competitive, and environmental factors. The essence of 
the procedure is the construction of a structural model of the decision in a 
form suitable for computation and manipulation; the realization of this model
is often a set of computer programs. 

2.1. The Decision Analysis Procedure 

Table 1 lists the three phases of a decision analysis that are worth distinction: 
the deterministic, probabilistic, and post-mortem phases. 

Table 1 

The Decision Analysis Procedure 


I. Deterministic phase 

1. Define the decision 

2. Identify the alternatives 

3. Assign values to outcomes 

4. Select state variables 

5. Establish relationships among state variables

6. Specify time preference 

Analysis: (a) Determine dominance to eliminate alternatives 

(b) Measure sensitivity to identify crucial state variables 

II. Probabilistic phase 

1. Encode uncertainty on crucial state variables 
Analysis: Develop profit lottery 

2. Encode risk preference 
Analysis: Select best alternative 

III. Post-mortem phase 

Analysis: (a) Determine value of eliminating uncertainty in crucial state 
variables 

(b) Develop most economical information-gathering program



2.1.1. The Deterministic Phase

The first step in the deterministic phase is to answer the question, “What 
decision must be made?” Strange as it may seem, many people with what 
appear to be decision problems have never asked themselves that question. 
We must distinguish between situations in which there is a decision to be made 
and situations in which we are simply worried about a bad outcome. If we have 
resources to allocate, we have a decision problem, but if we are only hand 
wringing about circumstances beyond our control no formal analysis will help. 
The difference is that between selecting a surgeon to operate on a member of 
your family and waiting for the result of the operation. We may be in a state of 
anguish throughout, but decision analysis can help only with the first question. 

The next step is to identify the alternatives that are available, to answer the 
question, “ What courses of action are open to us? ” Alternative generation is the 
most creative part of the decision analysis procedure. Often the introduction 
of a new alternative eliminates the need for further formal analysis. Although 
the synthesis of new alternatives does not necessarily fall within the province of
the decision analysis procedure, the procedure does evaluate alternatives and 
thereby suggests the defects in present alternatives that new alternatives might 
remedy. Thus the existence of an analytic procedure is the first step toward 
synthesis. 

We continue the deterministic phase by assigning values to the various
outcomes that might be produced by each alternative. We thus answer the
question, "How are you going to determine which outcomes are good and
which are bad?" In business problems this will typically be a measure of profit.
Military and governmental applications should also consider profit, measured
perhaps with more difficulty, because these decision makers are also allocating
the economic resources of the nation. Even when we agree on the measure of
profit to be assigned to each outcome, it may be difficult to make the assignment
until the values of a number of variables associated with each outcome are
specified. We call these variables the state variables of the decision. Their
selection is the next step in the deterministic phase.

A typical problem will have state variables of many kinds: costs of manu-
facture, prices charged by competitors, the failure rate of the product, etc. We
select them by asking the question, "If you had a crystal ball, what numerical
questions would you ask it about the outcome in order to specify your profit
measure?" At the same time that we select these variables we should assign
both nominal values for them and the range over which they might vary, for
future reference.

Next we establish how the state variables are related to each other and to
the measure of performance. We construct, in essence, a profit function that 
shows how profit is related to the factors that underlie the decision. The con- 
struction of this profit function requires considerable judgment to avoid the twin 
difficulties of excessive complexity and unreal simplicity. 

If the results of the decision extend over a long time period, it will be neces-
sary to have the decision maker specify his time preference for profit. We must
ask, "How does profit received in the future compare in value to profit received
today?" or an equivalent question. In cases in which we can assume a perfect
financial environment the present value of future profit at some rate of interest 
will be the answer. In many large decision problems, however, the nature of the 
undertaking has an effect on the basic financial structure of the enterprise. In 
these cases a much more realistic modeling of the time preference for profit
is necessary. 

Now that we have completed the steps in the deterministic phase we have a 
deterministic model of the decision problem. We next perform two closely 
related analyses. We perform them by setting the state variables to their 
nominal values and then sweeping each through its range of values, individually 
and jointly, as judgment dictates. Throughout this process we observe which 
alternative would be best and how much value would be associated with each 
alternative. We often observe that regardless of the values the state variables 
take on in their ranges one alternative is always superior to another, a condition 
we describe by saying that the first alternative dominates the second. The 
principle of dominance may often permit a major reduction in the number of 
alternatives that need be considered. 

As a result of this procedure we have performed a sensitivity analysis on 
the state variables. We know bow much n 10 percent change in one of the 
variables will affect profit, hence the optimum alternative. Similarly, we know 
how changes in state variables may interact to affect the decision. This sensi- 
tivity analysis shows us where uncertainty is important. We identify those state 
variables to which the outcome is sensitive as “ crucial ” state variables. Deter- 
mining how uncertainties in the crucial state variable influence the decision is 
the concern of the probabilistic phase of the decision analysis. 
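The deterministic sweep and dominance test just described can be mechanized in a few lines. The profit models, variable names, and ranges below are invented for illustration; they are not Howard's:

```python
def dominates(profit_a, profit_b, sweeps):
    """Deterministic dominance: alternative A is at least as profitable as B
    at every swept setting of the state variables.
    profit_*(values) -> profit; sweeps: list of state-variable settings."""
    return all(profit_a(v) >= profit_b(v) for v in sweeps)

# Illustrative model (assumed): profit = volume * (price - unit cost).
sweeps = [{"unit_cost": c / 10} for c in range(10, 31)]   # sweep cost 1.0 .. 3.0
a = lambda v: 1000 * (5.0 - v["unit_cost"])               # alternative A: volume 1000
b = lambda v: 800 * (5.0 - v["unit_cost"])                # alternative B: volume 800
a_dominates_b = dominates(a, b, sweeps)                   # A dominates over the range
```

When such a sweep shows one alternative superior throughout the ranges, that alternative's rival can be discarded before any probability assessment is attempted.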

2.1.2. Probabilistic Phase

The probabilistic phase begins by encoding uncertainties on each of the 
crucial state variables; that is, gathering priors on them. A subset of the crucial 
state variables will usually be independent — for these only a single probability 
distribution is necessary. The remainder will have to be treated by collecting 
conditional as well as marginal distributions. We have more to say on this 
process later. 

The next step is to find the uncertainty in profit for each alternative implied
by the functional relationship of profit to the crucial state variables and the
probability distribution on those crucial state variables for the alternative.
We call this derived probability distribution of profit the profit lottery of the
alternative. In a few cases the profit lottery can be derived analytically and in
many by numerical analysis procedures. In any case it may be approximated by
a Monte Carlo simulation. Regardless of the procedure used, the result is a
probability distribution on profit (or perhaps on discounted profit) for each of
the alternatives that remain in the problem.
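A Monte Carlo approximation of a profit lottery, as mentioned above, can look like the following sketch; the priors and profit function are illustrative assumptions, not data from the paper:

```python
import random

def profit_lottery(profit, sample_state, trials=10000, seed=1):
    """Approximate one alternative's profit lottery by Monte Carlo: draw the
    crucial state variables from their priors and push each draw through the
    deterministic profit function; return the sorted sample of profits."""
    rng = random.Random(seed)
    return sorted(profit(sample_state(rng)) for _ in range(trials))

# Illustrative priors (assumed): uniform demand, triangular unit cost.
sample = lambda rng: {"demand": rng.uniform(500, 1500),
                      "cost": rng.triangular(1.0, 3.0, 2.0)}
profit = lambda s: s["demand"] * (4.0 - s["cost"])
lottery = profit_lottery(profit, sample)
median = lottery[len(lottery) // 2]
```

The sorted sample is a direct estimate of the profit distribution, from which the complementary cumulative curves discussed next can be read off.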

Now we must consider how to choose between two alternatives with different 
profit lotteries. In one case the choice is easy. Suppose that we plot the profit
lottery for each alternative in complementary cumulative form; that is, plot the
probability of profit exceeding x for any given x.

Figure 1. Stochastic dominance.

Suppose further, as shown in Figure 1, that the complementary cumulative
for alternative A_2 always lies above that for alternative A_1. This means that
for any number x there is a higher probability of profit exceeding that number
with alternative A_2 than with alternative A_1. In this case we would prefer
alternative A_2 to alternative A_1, provided only that we liked more profit better
than less profit. We describe this situation by saying that the profit from
alternative A_2 is stochastically greater than the profit from alternative A_1, or
equivalently by saying that alternative A_2 stochastically dominates alternative
A_1. Stochastic dominance is a
concept that appeals intuitively to management; it applies in a surprising 
number of cases. 
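Checking first-order stochastic dominance from sampled profit lotteries is a direct comparison of the two complementary cumulative curves; a small sketch with illustrative sample values:

```python
def stochastically_dominates(lottery_a, lottery_b, levels):
    """First-order stochastic dominance from equal-size samples: the
    complementary cumulative of A never dips below that of B."""
    def exceed(lottery, x):                        # estimated P(profit > x)
        return sum(1 for p in lottery if p > x) / len(lottery)
    return all(exceed(lottery_a, x) >= exceed(lottery_b, x) for x in levels)

# Illustrative sampled lotteries (assumed): A_2 shifts A_1 up by one unit.
a = [2, 4, 6, 8]
b = [1, 3, 5, 7]
xs = range(0, 9)
a_dominates_b = stochastically_dominates(a, b, xs)
```

When the test succeeds, the dominated alternative can be discarded without any assessment of the decision maker's risk preference.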




Figure 2, however, illustrates a case in which stochastic dominance does not
apply. When faced with a situation like this, we must either abandon formal 
methods and leave the selection of the best alternative to judgment or delve into 
the measurement of risk preference. If we choose to measure risk preference 
we begin the second step of the probabilistic phase. We must construct a 
utility function for the decision maker that will tell us whether or not, for 
example, he would prefer a certain 4 million dollars profit to equal chances of 
earning zero or 10 million dollars. Although these questions are quite foreign 
to management, they are being asked increasingly often with promising results. 
Of course, when risk preference is established in the form of a utility function, 
the best alternative is the one whose profit lottery has the highest utility. 
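Once a utility function is assessed, ranking alternatives is mechanical: compare the expected utilities of the profit lotteries. The exponential utility curve below is an assumed illustrative form, not one the paper prescribes, applied to the certain-4-versus-0-or-10 question above:

```python
import math

def expected_utility(lottery, utility):
    """Score an (equally likely) sampled profit lottery by expected utility."""
    return sum(utility(p) for p in lottery) / len(lottery)

# Exponential utility with risk tolerance rho -- an assumed illustrative form.
u = lambda profit, rho=10.0: 1.0 - math.exp(-profit / rho)

sure_thing = [4.0, 4.0]        # a certain 4 (millions)
gamble = [0.0, 10.0]           # equal chances of zero or 10
best = max([sure_thing, gamble], key=lambda L: expected_utility(L, u))
# With rho = 10 this risk-averse curve prefers the certain 4.
```

A more risk-tolerant decision maker (larger rho) would flip the preference, which is exactly the kind of sensitivity worth reporting back to management.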

2.1.3. Post-Mortem Phase

The post-mortem phase of the procedure is composed entirely of analysis. 
This phase begins when the best alternative has been selected as the result of 
the probabilistic phase. Here we use the concepts of the clairvoyant lottery to 
establish a dollar value of eliminating uncertainty in each of the state variables 
individually and jointly. Being able to show the impact of uncertainties on 
profit is one of the most important features of decision analysis. It leads directly 
to the next step of the post-mortem, which is finding the most economical 
information-gathering program, if, in fact, it would be profitable to gather more 
information. The information-gathering program may be physical research, a 
marketing survey, or the hiring of a consultant. Perhaps in no other area of its 
operations is an enterprise in such need of substantiating analysis as it is in the 
justification of information-gathering programs. 
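In the discrete case, the clairvoyant-lottery computation reduces to an expected-value-of-perfect-information calculation; the following sketch uses invented numbers for illustration:

```python
def value_of_clairvoyance(scenarios, probs, payoffs):
    """Dollar value of eliminating uncertainty in one state variable:
    expected payoff when the best act is chosen after the scenario is
    revealed, minus the expected payoff of the best single act chosen
    under uncertainty.  payoffs[act][scenario] -> profit."""
    acts = list(payoffs)
    best_under_uncertainty = max(
        sum(p * payoffs[a][s] for s, p in zip(scenarios, probs)) for a in acts)
    with_clairvoyance = sum(
        p * max(payoffs[a][s] for a in acts) for s, p in zip(scenarios, probs))
    return with_clairvoyance - best_under_uncertainty

# Illustrative numbers (assumed): launch vs. wait under high or low demand.
payoffs = {"launch": {"high": 10.0, "low": -4.0},
           "wait":   {"high": 2.0,  "low": 1.0}}
evc = value_of_clairvoyance(["high", "low"], [0.5, 0.5], payoffs)
```

The resulting figure is an upper bound on what any real (imperfect) information-gathering program on that variable could be worth.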

Of course, once the information-gathering scheme, if any, is completed, its
information modifies the probability distributions on the crucial state variables
and consequently affects the decision. Indeed, if the information-gathering
program were not expected to modify the probability distributions on the
crucial state variables it would not be conducted. We then repeat the proba-
bilistic phase by using the new probability distributions to find the profit lotteries
and then enter the post-mortem phase once more to determine whether further
information gathering is worthwhile. Thus the decision analysis is a vital
structure that lets us compare at any time the values of such alternatives as
acting, postponing action and buying information, or refusing to consider the
problem further. We must remember that the analysis is always based on
the current state of knowledge. Overnight there can arrive a piece of infor-
mation that changes the nature of the conclusions entirely. Of course, having
captured the basic structure of the problem, we are in an excellent position to
incorporate any such information.

Finally, as the result of the analysis the decision maker embarks on a course 
of action. At this point he may be interested in the behavior of several of the 
state variables for planning purposes; for example, having decided to introduce
a new product, he may want to examine the probability distributions for its
sales in future years to make subsidiary decisions on distribution facilities or
on the size of the sales force. The decision-analysis model readily provides
such planning information. 

2.2. The Advantages of Decision Analysis 

Decision analysis has many advantages, of which we have described just 
a few, such as its comprehensiveness and vitality as a model of the decision and 
its ability to place a dollar value on uncertainty. We should point out further 
that the procedure is relevant to both one-of-a-kind and repetitive decisions.
Decision analysis offers the operations research profession the opportunity to 
extend its scope beyond its traditional primary concern with repetitively 
verifiable operations. 

One of the most important advantages of decision analysis lies in the way it 
encourages meaningful communication among the members of the enterprise 
because it provides a common language in which to discuss decision problems. 
Thus engineers and marketing planners with quite different jargons can appreciate 
one another’s contributions to a decision. Both can use the decision-analysis 
language to convey their feelings to management quickly and effectively. 

A phenomenon that seems to be the result of the decision-analysis language 
is the successive structuring of staff groups to provide reports that are useful 
in decision-analysis terms. Thus, if the decision problem being analyzed starts 
in an engineering group, that group ultimately seeks inputs from marketing, 
product planning, the legal staff, and so on, that are compatible with the proba- 
bilistic analysis. Soon these groups begin to think in probabilistic terms and to 
emphasize probabilistic thinking in their reports. The process seems irreversible 
in that once the staff of an organization becomes comfortable in dealing 
with probabilistic phenomena they are never again satisfied with deterministic 
or expected value approaches to problems. Thus the existence of decision- 
analysis concepts as a language for communication may be its most important 
advantage. 

2.3. The Hierarchy of Decision Analysis 

It is informative to place decision analysis in the hierarchy of techniques 
that have been developed to treat decision problems. We see that a decision 
analysis requires two supporting activities. One is a lower order activity that we 
call alternative evaluation; the second, a higher order activity that we call goal 
setting. Performing a decision analysis requires evaluating alternatives according 
to the goals that have been set for the decision. The practitioners of operations 
research are quite experienced in alternative evaluation in both industrial and 
military contexts. In fact, in spite of the lip service paid to objective functions, 
only rare operations researchers have had the scope necessary to consider the 
goal-setting problems. 

RONALD A. HOWARD

All mankind seems inexpert at goal setting, although it is the most important 
problem we face. Perhaps the role of decision analysis is to allow the discussion 
of decisions to be carried on at a level that shows the explicit need for goals or 
criteria for selection of the best alternative. We need to make goals explicit only 
if the decision maker is going to delegate the making of the decision or if he is 
unsure of his ability to be consistent in selecting the best alternative. We shall 
not comment on whether there is a trend toward more or less delegation of 
decision making. However, it is becoming clear to those with decision-making 
responsibilities that the increasing complexity of the operations under their 
control requires correspondingly more formal approaches to the problem of 
organizing the information that bears on a decision if inconsistent decisions are 
to be avoided. 

The history of the analysis of the procurement of military weapons systems 
points this out. Recent years have shown the progression of procurement 
thinking from effectiveness to cost effectiveness. In this respect the military 
authorities have been able to catch up in their decision-making apparatus to 
what industry had been doing in its simpler problems for years. Other agencies 
of government arc now in the process of making the same transition. Now all 
must move on to the inclusion of uncertainty, to the establishment of goals that 
are reflected in risk and time preferences. 

These developments are now on the horizon and in some cases in sight; 
for example, although we have tended to think of utility theory as an 
academic pursuit, one of our major companies was recently faced with the 
question, “Is 10 million dollars of profit sufficient to incur one chance in 1 million 
of losing 1 billion dollars?” Although the loss is staggering, it is realistic 
for the company concerned. Should such a large company be risk-indifferent 
and make decisions on an expected value basis? Are stockholders responsible 
for diversifying their risk externally to the company or should the company be 
risk-averting on their behalf? For the first time the company faced these questions 
in a formal way rather than deciding the particular question on its own 
merits, and this we must regard as a step forward. 

Decision analysis has had its critics, of course. One said, “In the final 
analysis, aren’t decisions politically based? ” The best answer to that came from 
a high official in the executive branch of our government who said, “The better 
the logical basis for a decision, the more difficult it is for extraneous political 
factors to hold sway.” It may be discouraging in the short run to see logic 
overridden by the tactical situation, but one must expect to lose battles to win 
the war. 


Another criticism is, “If this is such a good idea, why haven’t I heard of it 
before?” One very practical reason is that the operations we conduct in the 
course of a decision analysis would be expensive to carry out without using 
computers. To this extent decision analysis is a product of our technology. 
There are other answers, however. One is that the idea of probability as a state 
of mind and not of things is only now regaining its proper place in the world of 
thought. The opposing heresy lay heavy on the race for the better part of a 
century. We should note that most of the operations research performed in 
World War II required mathematical and probabilistic concepts that were 



course of history. 





3. THE PRINCIPLES OF THE DECISION ANALYST 

Next we turn to the principles of the decision analyst, the professional who 
embarks on preparing a decision analysis. His first principle is to identify and 
isolate the components of the decision — the uncertainty, risk aversion, time 
preference, and problem structure. Often arguments over which is the best 
decision arise because the participants do not realize that they are arguing on 
different grounds. Thus it is possible for A to think that a certain alternative is 
riskier than it is in B's opinion, either because A assigns different probabilities 
to the outcomes than B but both are equally risk-averting, or because A and B 
assign the same probabilities to the outcomes but differ in their risk aversion. 
If we are to make progress in resolving the argument, we must identify the 
nature of the difficulty and bring it into the open. Similar clarifications may be 
made in the areas of time preference or in the measurement of the value of 
outcomes. 

One aid in reducing the problem to its fundamental components is restricting 
the vocabulary that can be used in discussing the problem. Thus we carry on 
the discussion in terms of events, random variables, probabilities, density functions, 
expectations, outcomes, and alternatives. We do not allow fuzzy thinking about 
the nature of these terms. Thus “the density function of the probability” 
and “the confidence in the probability estimate” must be nipped in the bud. 
We speak of “assigning,” not “estimating,” the probabilities of events and think 
of this assignment as based on our “state of information.” These conventions 
eliminate statements like the one recently made on a TV panel of doctors who 
were discussing the right of a patient to participate in decision making on his 
treatment. One doctor asserted that the patient should be told of “some kind 
of a chance of a likelihood of a bad result.” I am sure that the doctor was a 
victim of the pressures of the program and would agree with us that telling 
the patient the probability the doctor would assign to a bad result would be 
preferable. 

One principle that is vital to the decision analyst is professional detachment 
in selecting alternatives. The analyst must not become involved in the heated 
political controversies that often surround decisions except to reduce them to a 
common basis. He must demonstrate his willingness to change the recommended 
alternative in the face of new information if he is to earn the respect of all con- 
cerned. This professional detachment may, in fact, be the analyst’s single most 
valuable characteristic. Logic is often severely strained when we are personally 
involved. 

The detachment of the analyst has another positive benefit. As an observer 
he may be able to suggest alternatives that may have escaped those who are 
intimately involved with the problem. He may suggest delaying action, buying 
insurance, or performing a test, depending on the nature of the decision. Of 
course, the comprehensive knowledge of the properties of the existing alternatives 
that the decision analyst must gain is a major aid in formulating new alternatives. 

Since it is a rare decision that does not imply other present and future 
decisions, the decision analyst must establish a scope for the analysis that is 
broad enough to provide meaningful answers but not so broad as to impose 
impractical computational requirements. Perhaps the fundamental question in 
establishing scope is how much to spend on decision analysis. Because the 
approach could be applied both to selecting a meal from a restaurant menu and 
to allocating the federal budget, the analyst needs some guidelines to determine 
when the analysis is worthwhile. 

The question of how much decision analysis to do is an economic problem 
itself susceptible to a simpler decision analysis, but rather than pursue that road let us 
pose an arbitrary, reasonable, but indefensible rule of thumb: spend at least 
1 percent of the resources to be allocated on the question of how they should be 
allocated. Thus, if we were going to buy a 2000-dollar automobile, the rule 
indicates a 20-dollar analysis, whereas for a 20,000-dollar house it would specify 
a 200-dollar analysis. A 1-million-dollar decision would justify 10,000 dollars’ 
worth of analysis or, let us say, about three man-months. The initial reaction to 
this guideline has been that it is conservative in the sense of not spending much 
on analysis; yet, when we apply it to many decisions now made by business and 
government, the reaction is that the actual expenditures on analysis are only 
one-tenth or one-hundredth as large as the rule would prescribe. Of course, 
we can all construct situations in which a much smaller or larger expenditure 
than given by the rule would be appropriate, and each organization can set its 
own rule, perhaps making the amount spent on analysis nonlinear in the re- 
sources to be allocated. Nevertheless, the 1 percent figure has served well to 
illustrate where decision analysis can be expected to have the highest payoff. 
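The rule of thumb is easy to mechanize. The sketch below is not from the paper; it simply applies the rule with the fraction as a tunable parameter, reproducing the paper's own illustrations.

```python
def analysis_budget(resources, fraction=0.01):
    """Rule-of-thumb spend on deciding how to allocate `resources`:
    at least `fraction` (here 1 percent) of the resources at stake."""
    return fraction * resources

# The paper's illustrations:
assert analysis_budget(2_000) == 20          # 2000-dollar automobile
assert analysis_budget(20_000) == 200        # 20,000-dollar house
assert analysis_budget(1_000_000) == 10_000  # 1-million-dollar decision
```

An organization preferring a nonlinear rule, as the text suggests, need only replace the body of the function.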

The professional nature of the decision analyst becomes apparent when he 
balances realism in the various parts of the decision-analysis model. Here he 
can be guided only by what used to be called engineering judgment. One 
principle he should follow is to avoid sophistication in any part of the problem 
when that sophistication would not affect the result. We can describe this 
informally by saying that he should strive for a constant “wince” level as he 
surveys all parts of the analysis. One indication that he has achieved this state 
is that he would be torn among many possibilities for improvement if we 
allowed him to devote more time and resources to the decision model. 


4. THE ENCODING OF SUBJECTIVE INFORMATION 

One unique feature of decision analysis is the encoding of subjective infor- 
mation, both in the form of risk aversion and in the assignment of probabilities. 

4.1. Risk Aversion and Time Preference 

Since we are dealing in most cases with enterprises rather than individuals, 
the appropriate risk aversion and time preference should be that of the enter- 
prise. The problem of establishing such norms is beyond our present scope. 
It is easy, however, to demonstrate to managers, or to anyone else for that 
matter, that the phenomenon of risk aversion exists and that it varies widely 
from individual to individual. One question useful in doing this is, “How much 
would you have to be paid to call a coin, double or nothing, for next year’s 
salary?” Regardless of the salary level of the individuals involved, this is a 
provocative question. We point out that only a rare individual would play such 
a game for a payment of zero and that virtually everyone would play for a 
payment equal to next year’s salary, since then there would be nothing to lose. 
Thereafter we are merely haggling over the price. Payments in the range of 
60 percent to 99 percent of next year’s salary seem to satisfy the vast majority 
of professional individuals. 
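The haggling over the price can be made precise once a utility function is assumed. The sketch below is not from the paper: it assumes an exponential utility u(x) = -exp(-x/rho) with risk tolerance rho and solves by bisection for the smallest payment that makes the double-or-nothing call acceptable. Smaller rho (more risk aversion) pushes the required payment toward the full salary, in the spirit of the 60 to 99 percent range quoted above.

```python
import math

def required_payment(salary, rho, tol=1e-6):
    """Smallest payment c making a fair double-or-nothing call on
    `salary` acceptable under exponential utility u(x) = -exp(-x/rho)."""
    u = lambda x: -math.exp(-x / rho)
    # Expected utility of taking payment c and calling the coin:
    # win -> c + 2*salary, lose -> c + 0.
    eu = lambda c: 0.5 * u(c + 2 * salary) + 0.5 * u(c)
    lo, hi = 0.0, float(salary)    # playing for a full salary is always acceptable
    while hi - lo > tol * salary:
        mid = (lo + hi) / 2
        if eu(mid) >= u(salary):   # acceptable: try a smaller payment
            hi = mid
        else:
            lo = mid
    return hi
```

For a 50,000 salary, a risk tolerance of twice the salary demands roughly a quarter of it, a tolerance of half the salary about two-thirds, and a nearly risk-neutral caller (very large rho) next to nothing.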

The steps required to go from a realization of personal risk aversion and time 
preference to corporate counterparts and finally to a reward system for managers 
that will encourage them to make decisions consistent with corporate risk 
aversion and time preference remain a fascinating area of research. 

4.2. Encoding of Uncertainty 

When we begin the probabilistic phase of the decision analysis, we face the 
problem of encoding the uncertainty in each of the crucial state variables. 
We shall want to have the prior probability distributions assigned by the people 
within the enterprise who are most knowledgeable about each state variable. 
Thus the priors on engineering variables will typically be assigned by the 
engineering department; on marketing variables, by the marketing department, 
and so on. However, since we are in each case attempting to encode a probability 
distribution that reflects a state of mind and since most individuals have real 
difficulty in thinking about uncertainty, the method we use to extract the priors 
is extremely important. As people participate in the prior-gathering process, 
their attitudes are indicated successively by, “This is ridiculous,” “It can’t be 
done,” “I have told you what you want to know but it doesn’t mean anything,” 
“Yes, it seems to reflect the way I feel,” and “Why doesn’t everybody do this?” 
In gathering the information we must be careful to overcome the defenses the 
individual develops as a result of being asked for estimates that are often a 
combination of targets, wishful thinking, and expectations. The biggest difficulty 
is in conveying to the man that you are interested in his state of knowledge 
and not in measuring him or setting a goal for him. 

If the subject has some experience with probability, he often attempts to 
make all his priors look like normal distributions, a characteristic we may 
designate as “bell-shaped” thinking. Although normal distributions are appropriate 
priors in some circumstances, we must avoid making them a foregone 
conclusion. 

Experience has shown certain procedures to be effective in this almost 
psychoanalytic process of prior measurement. The first procedure is to make 
the measurement in a private interview to eliminate group pressure and to over- 
come the vague notions that most people exhibit about matters probabilistic. 
Sending around forms on which the subjects are supposed to draw their priors 
has been worse than useless, unless the subjects were already experienced in 
decision analysis. 

Next we ask questions of the form, “What are the chances that x will exceed 
10?” because people seem much more comfortable in assigning probabilities to 
events than they are in sketching a density function. As these questions are 
asked, we skip around, asking the probability that x will be “greater than 50, 
less than 10, greater than 30,” often asking the same question again later in the 
interview. The replies are recorded out of the view of the subject in order to 
frustrate any attempt at forced consistency on his part. As the interview pro- 
ceeds, the subject often considers the questions with greater and greater care, 
so that his answers toward the end of the interview may represent his feelings 
much better than his initial answers. We can change the form of the questions by 
asking the subject to divide the domain of the random variable into n mutually 
exclusive regions with equal probability. (Of course, we would never put the 
question to him that way.) We can use the answers to all these questions to 
draw the complementary cumulative distribution for the variable, a form of 
representation that seems easiest to convey to people without formal prob- 
abilistic training. 
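The bookkeeping behind such an interview can be sketched simply. In the code below (mine, with invented numbers), the subject's possibly repeated and slightly inconsistent answers to "what are the chances x exceeds v?" are averaged per value, sorted, and forced to be nonincreasing, yielding points on the complementary cumulative curve.

```python
from collections import defaultdict

def complementary_cdf(answers):
    """answers: list of (value v, elicited P(x > v)) pairs, possibly with
    repeats from re-asking. Returns sorted (v, P(x > v)) points, with
    repeats averaged and monotonicity (nonincreasing in v) enforced."""
    by_value = defaultdict(list)
    for v, p in answers:
        by_value[v].append(p)
    points = sorted((v, sum(ps) / len(ps)) for v, ps in by_value.items())
    curve, running_min = [], 1.0
    for v, p in points:
        running_min = min(running_min, p)   # P(x > v) cannot rise with v
        curve.append((v, running_min))
    return curve

# Hypothetical interview: re-asking at v = 17 gave 0.2 and then 0.25.
elicited = [(10, 0.9), (30, 0.15), (17, 0.2), (50, 0.02), (17, 0.25)]
print(complementary_cdf(elicited))
```

Averaging the repeated answers is only one possible policy; taking the later answer, which the text suggests may better represent the subject's feelings, is an equally defensible choice.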

The result of this interview is a prior that the subject is willing to live with, 
regardless of whether we are going to use it to govern a lottery on who buys coffee 
or on the disposal of his life savings. We can test it by comparing the prior with 
known probabilistic mechanisms; for example, if he says that a is the median 
of the distribution of x, then he should be indifferent about whether we pay him 
one hundred dollars if x exceeds a or if he can call the toss of a coin correctly. 
If he is not indifferent, then we must require him to change a until he is. The 
end result of such questions is to produce a prior that the subject is not tempted to 
change in any way, and we have thus achieved our final goal. The prior-gathering 
process is not cheap, but we perform it only on the crucial state variables. 

In cases in which the interview procedure is not appropriate, the analyst 
can often obtain a satisfactory prior by drawing one himself and then letting the 
subject change it until the subject is satisfied. This technique may also be useful 
as an educational device in preparation for the interview. 

If two or more variables are dependent, we must gather priors on conditional 
as well as marginal distributions. The procedure is generally the same but 
somewhat more involved. However, we have the benefit of being able to apply 
some checks on our results. Thus, if we have two dependent variables x and y, 
we can obtain the joint distribution by measuring the prior on x and the conditional 
on y, given x, or, alternatively, by measuring the prior on y and the conditional 
on x, given y. If we follow both routes, we have a consistency check on 
the joint distribution. Since the treating of joint variables is a source of expense, 
we should formulate the problem to avoid them whenever possible. 
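For discrete priors the two-route check is mechanical: build the joint distribution both ways and compare. A sketch with invented numbers:

```python
def joint_from(marginal, conditional):
    """Joint P(a, b) from P(a) and P(b | a), as a dict keyed by (a, b)."""
    return {(a, b): pa * pb_given_a
            for a, pa in marginal.items()
            for b, pb_given_a in conditional[a].items()}

# Hypothetical two-valued variables x and y, assessed both ways.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.25, 1: 0.75}}
p_y = {0: 0.4, 1: 0.6}
p_x_given_y = {0: {0: 0.75, 1: 0.25}, 1: {0: 0.5, 1: 0.5}}

joint_a = joint_from(p_x, p_y_given_x)                       # keyed (x, y)
joint_b = {(x, y): p                                         # re-key (y, x) -> (x, y)
           for (y, x), p in joint_from(p_y, p_x_given_y).items()}

discrepancy = max(abs(joint_a[k] - joint_b[k]) for k in joint_a)
print(discrepancy < 1e-9)  # the two assessments agree here
```

A large discrepancy would send the analyst back to the subjects, just as an inconsistent median does in the single-variable test above.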

To illustrate the nature of prior gathering we present the example shown 
in Figure 3. The decision in a major problem was thought to depend primarily 
on the average lifetime of a new material. Since the material had never been 
made and test results would not be available until three years after the decision 
was required, it was necessary to encode the knowledge the company now had 
concerning the life of the material. This knowledge resided in three professional 
metallurgists who were experts in that field of technology. These men were 
interviewed separately according to the principles we have described. They 
produced the points labeled “Subjects 1, 2, and 3” in Figure 3. These results 
have several interesting features. We note, for example, that for t = 17 Subject 
2 assigned probability 0.2 and 0.25 at various points in the interview. On the 
whole, however, the subjects were remarkably consistent in their assignments. 
We observe that Subject 3 was more pessimistic than Subject 1. 

Figure 3. Priors on lifetime of material. 

At the conclusion of the three interviews the three subjects were brought 
together and shown the results. At this point a vigorous discussion took place. 
Subjects 1 and 3, in particular, brought forth information of which the other two 
members of the group were unaware. As the result of this information exchange, 
the three group members drew the consensus curve — each subject said that this 
curve represented the state of information about the material life at the end of the 
meeting. 

It has been suggested that the proper way to reconcile divergent priors is 
to assign weights to each, multiply, and add, but this experiment is convincing 
evidence that any such mechanistic procedure misses the point. Divergent 
priors are an excellent indicator of divergent states of information. The 
experience just described not only produced the company’s present encoding of 
uncertainty about the lifetime of the material but at the same time encouraged 
the exchange of information within the group. 


5. A DECISION-ANALYSIS EXAMPLE 

To illustrate the flavor of application let us consider a recent decision analysis 
in the area of product introduction. Although the problem was really from 
another industry, let us suppose that it was concerned with the development 
and production of a new type of aircraft. There were two major alternatives: 
to develop and sell a new aircraft (A2) or to continue manufacturing and selling 
the present product (A1). 

Figure 4. Decision analysis for new product introduction. 

The decision was to be based on the present value of 
future expected profits at a discounting rate of 10 percent per year. Initially, 
the decision was supposed to rest on the lifetime of the material for which we 
obtained the priors in Figure 3; however, a complete decision analysis was 
desired. Since several hundred million dollars in present value of profit were at 
stake, the decision analysis was well justified. 

The general scheme of the analysis appears in Figure 4. The first step was 
to construct a model of the business, a model that was primarily a model of the 
market. The profit associated with each alternative was described in terms of 
the price of the product, its operating capital costs, the behavior of its competi- 
tors, and the national characteristics of customers. The actual profit and dis- 
counted profit were computed over a 22-year time period. A suspicion grew 
that this model did not adequately capture the regional nature of demand. 
Consequently a new model was constructed that included the market charac- 
teristics, region by region and customer by customer. Moving to the more 
detailed basis affected the predictions so much that the additional refinement 
was clearly justified. Other attempts at refinement, however, did not affect the 
results sufficiently to justify a still more refined model. Now, the sensitivity 
analysis was performed to determine the crucial state variables, which turned 
out to be the operating cost, capital cost, and a few market parameters. Because 
of the complexity of the original business model, an approximate business model 
essentially quadratic in form was constructed to show how profit depended on 
these crucial state variables in the domain of interest. The coefficients of the 
approximate business model were established by runs on the complete business 
model. 

The market priors were directly assigned with little trouble. However, 
because the operating and capital costs were the two most important variables 
in the problem, these priors were assigned according to a more detailed 
procedure. First, the operating cost was related to various physical features of the 
design by the engineering department. This relationship was called the oper- 
ating-cost function. One of the many input physical variables was the average 
lifetime of the material whose priors appear in Figure 3. All but two of the 
12 physical input variables were independent. The priors on the whole set of 
input variables were gathered and used with the operating-cost function in a 
Monte Carlo simulation that produced a prior for the operating cost of the 
product. 
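The simulation step can be sketched as follows. The cost function and the input distributions below are invented stand-ins (the real ones came from the engineering department), and the one dependent pair of inputs is handled by sampling one variable conditional on the other.

```python
import random

random.seed(7)

def sample_inputs():
    """Hypothetical physical inputs; `lifetime` stands in for the material
    lifetime of Figure 3, and `wall_thickness` is sampled conditionally on it
    to represent the one dependent pair among the inputs."""
    lifetime = random.triangular(5, 30, 15)                    # years
    wall_thickness = random.gauss(2.0 + 0.05 * lifetime, 0.1)  # dependent input
    weight = random.gauss(1000, 50)
    return lifetime, wall_thickness, weight

def operating_cost(lifetime, wall_thickness, weight):
    """Invented operating-cost function (dollars per year)."""
    return 50_000 / lifetime + 800 * wall_thickness + 12 * weight

# Monte Carlo: push the input priors through the cost function
# to obtain an empirical prior on operating cost.
costs = sorted(operating_cost(*sample_inputs()) for _ in range(10_000))
median = costs[len(costs) // 2]
print(f"median operating cost ~ {median:,.0f}; "
      f"90% interval ({costs[500]:,.0f}, {costs[9500]:,.0f})")
```

The sorted samples are exactly the empirical complementary cumulative curve discussed earlier, now for a derived variable rather than an elicited one.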

The capital-cost function was again developed by engineering but was 
much simpler in form. The input uncertainties were the production costs for 
various parts of the product. Again, a Monte Carlo analysis produced a prior 
on capital cost. 

Once we had established priors on all inputs to the approximate business 
model, we could determine the profit lottery for each alternative, in this case 
by using numerical analysis. 

The present-value profit lotteries for the two alternatives looked very 
much like those shown in Figure 1. The new product alternative A2 
stochastically dominated the alternative A1 of continuing to manufacture the present 
product. The result showed two interesting facets of the problem. First, it 
had been expected that the profit lottery for the new product alternative would 
be considerably broader than it was for the old product. The image was that of 
a profitable and risky new venture compared with a less profitable but less risky 
standard venture. In fact, the results showed that the uncertainties in profit 
were about the same for both alternatives, thus showing how initial concepts 
may be misleading. 
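Stochastic dominance is easy to verify numerically once the two profit lotteries are tabulated: A2 dominates A1 (first order) if its probability of exceeding any threshold is at least A1's. A sketch with invented lottery samples:

```python
def stochastically_dominates(samples_a, samples_b):
    """True if lottery A first-order stochastically dominates lottery B:
    P(A > v) >= P(B > v) at every threshold v (checked on the pooled samples)."""
    exceed = lambda samples, v: sum(s > v for s in samples) / len(samples)
    return all(exceed(samples_a, v) >= exceed(samples_b, v)
               for v in sorted(set(samples_a + samples_b)))

# Invented present-value profit samples (millions of dollars).
a2 = [120, 150, 180, 210, 240]   # new product
a1 = [100, 130, 160, 190, 220]   # present product: shifted down
print(stochastically_dominates(a2, a1))  # True
print(stochastically_dominates(a1, a2))  # False
```

Note that the two invented lotteries have the same spread, echoing the paper's finding that the uncertainties in profit were about the same for both alternatives.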

The second interesting facet was that the average lifetime of the material 
whose priors appear in Figure 3 was actually of little consequence in the de- 
cision. It was true enough that profits were critically dependent on this lifetime 
if the design were fixed, but if the design were left flexible to accommodate to 
different average material lifetimes profits would be little affected. Furthermore, 
leaving the design flexible was not an expensive alternative; therefore another 
initial conception had to be modified. 

However, the problem did not yield so easily. Figure 5 shows the present 
value of profits through each number of years t for each alternative. Note that 
if we ignore returns beyond year 7 the present product has a higher present value 
but that if we consider returns over the entire 22-year period the relationship 
reverses, as we have already noted. When management saw these results, they 
were considerably disturbed. The division in question had been under heavy 
pressure to show a profit in the near future; alternative A2 would not meet that 
requirement. Thus the question of time preference that had been quickly 
passed off as one of present value at 10 percent per year became the central issue 
in the decision. The question was whether the division was interested in the 
quick kill or the long pull. At last report the division was still trying to convince 
the company to extend its profit horizon. 
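The effect in Figure 5 is just the arithmetic of cumulative discounting. With invented profit streams shaped like the case (the new product loses money early and earns more later), the sketch below computes present value through each year at 10 percent and finds the crossover year.

```python
def pv_through_year(profits, rate=0.10):
    """Cumulative present value of a yearly profit stream after each year."""
    pv, out = 0.0, []
    for t, profit in enumerate(profits, start=1):
        pv += profit / (1 + rate) ** t
        out.append(pv)
    return out

# Invented streams (millions/year) over 22 years: the present product (A1)
# pays steadily; the new product (A2) loses money early, then earns more.
a1 = [30] * 22
a2 = [-40] * 3 + [80] * 19

pv1, pv2 = pv_through_year(a1), pv_through_year(a2)
crossover = next(t for t in range(1, 23) if pv2[t - 1] > pv1[t - 1])
print(f"A2 overtakes A1 in year {crossover}")  # year 10 for these streams
```

A division judged on profits before the crossover year prefers A1; one judged on the full horizon prefers A2, which is precisely the conflict the analysis surfaced.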

Figure 5. Expected present value of profit. 

This problem clearly illustrates the use of decision analysis in clarifying the 
issues surrounding a decision. A decision that might have been made on the 
basis of a material lifetime was shown to depend more fundamentally on the 
question of time preference for profit. The nine man-months of effort devoted 
to this analysis were considered well spent by the company. The review 
committee for the decision commented, “We have never had such a realistic analysis 
of a new business venture before.” The company is now interested in 
instituting decision-analysis procedures at several organizational levels. 






6. CONCLUSION 

Decision analysis offers operations research a second chance at top management. 
By foregoing statistical reproducibility we can begin to analyze the 
one-of-a-kind problems that managers have previously had to handle without 
assistance. Experience indicates that the higher up the chain of management 
we progress the more readily the concepts we have outlined are accepted. A 
typical reaction is, “I have been doing this all along, but now I see how to reduce 
my ideas to numbers.” 

Decision analysis is no more than a procedure for applying logic. The 
ultimate limitation to its applicability lies not in its ability to cope with problems 
but in man’s desire to be logical. 


DECISION ANALYSIS: APPLIED DECISION THEORY 

RÉSUMÉ 

In recent years decision theory has been increasingly accepted as a conceptual 
framework for decision making. However, this theory has mainly affected 
statisticians rather than the people who need it most: the decision makers. 
This paper describes a procedure for placing real decision problems within the 
structure of decision theory. The decision-analysis procedure encompasses 
every step, from the measurement of risk preferences and of judgments bearing 
on the crucial factors, through the structuring of the technical, market, 
competitive, and environmental factors, to the measurement of subjective 
preferences and of the value of prediction. Decision analysis puts into perspective 
the many tools of simulation, numerical analysis, and probability transformation 
that have become increasingly convenient since the development of electronic 
computer systems in which several “stations” depend on a single central 
installation. 

The procedure is applied to a real decision problem that extends over decades 
and whose present value amounts to several hundred million dollars. The 
paper analyzes the problem of determining the expenditure to be devoted to 
decision analysis. One of the most important properties of this procedure is 
the number of auxiliary benefits created in the course of such a study. Experience 
shows that these benefits can exceed in value the cost of the effort devoted to 
reaching the decision. 





APPLICATION DE LA THÉORIE DE L’INFORMATION 
À UN MODÈLE DE CHOIX SÉQUENTIEL 
D’INSPECTIONS 

The Application of Information Theory to a 
Model of Sequential Selection of 
Inspection Methods 

Henri Marty 

Centre Interarmées de Recherche Opérationnelle, France 


The study undertaken here is set in the broader framework of a world in 
which an International Disarmament Agency would exist. The countries signing 
the arms-control agreement would then see certain production for military use 
(strategic missiles in particular) prohibited or subjected to quotas. 

With this in mind, we propose to define an objective method for choosing 
the control points, as well as a quantitative criterion for measuring the knowledge 
one has of the activities of the countries subject to this arms-control treaty. 

We shall begin by making precise the definitions and hypotheses of the model; 
we shall then introduce entropy and Bayesian statistics and use them 
to choose a control through sequential decisions. 

Finally, we shall give an overview of the simulation models intended to 
illustrate and test this theoretical model. 

1. DEFINITIONS 

1.1. General Remarks 

1. A fraud consists in the secret manufacture of nuclear delivery vehicles. 
Even for a given mission there exist various possible types of vehicles; 
moreover, a given type can be manufactured by various processes. 

One will therefore readily grant that a certain number of possible 
manufacturing processes can be defined. 

2. A process will be defined as a set of activities, these being 
delimited by technical considerations. 

While we have shown that the vehicles can be technically very different 
from one another, we must also note that there may exist 
activities common to several processes. This will be the case when two 
vehicles use, for example, the same first propulsion stage, the same 
guidance platform, or the same re-entry body that protects the military payload 
during its return into the atmosphere. 





Schematically, the different ways in which a country can build (fraudulently or not) different types of ballistic missiles can be represented as follows (Figure 1):


[Figure 1: a diagram of the manufacturing levels, from raw materials to final assembly.]

Levels:

1. Processing of the raw materials.

2. Preparation of the materials and first-transformation products.

3. Manufacture of the elementary parts and components.

4. Production of the main assemblies and components.

5. Assembly of the subassemblies.

6. Final assembly and functional tests.

Figure 1

3. A control will consist of a set of unit inspections (we shall simply say inspections in the remainder of this paper), the unit of inspection being defined by a certain cost or, better, by the work of a team of inspectors over a fixed period. Two inspections will be considered distinct if they bear on different activities or, when bearing on the same activities, if they employ different control methods.

Naturally, the same inspection may appear several times in a control.

1.2. Characterization of an Inspection

A (unit) inspection bearing on a given activity does not in general yield certainty that there is fraud or no fraud.

The notion of the quality of the information provided by an inspection must therefore be made precise. We shall assume that one can define a probability p, the probability that an inspection detects a fraud when there actually is fraud in the inspected activity, and likewise a probability q of concluding no fraud when there actually is no fraud.
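The (p, q) description above is exactly that of a noisy binary channel and can be simulated directly; the following sketch is ours, not the paper's (the function name and seeds are illustrative):

```python
import random

def inspect(fraud: bool, p: float, q: float, rng: random.Random) -> bool:
    """Simulate one unit inspection of quality (p, q).

    Returns True for a "fraud" conclusion. If there is fraud, it is detected
    with probability p; if there is none, "no fraud" is concluded with
    probability q (so a false alarm occurs with probability 1 - q).
    """
    if fraud:
        return rng.random() < p
    return rng.random() >= q

rng = random.Random(0)
# With p = q = 1 the inspection is perfect.
assert inspect(True, 1.0, 1.0, rng) and not inspect(False, 1.0, 1.0, rng)
```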





                      State of Reality
Conclusion            F (fraud)       NF (no fraud)
F (fraud)             p               1 − q
NF (no fraud)         1 − p           q

Figure 2


The relations between reality and the result of the inspection can then be summarized by the table above (Figure 2), whose analogy with the noise matrix of a line transmission will be noted.

The actual computation of the values of p and q will be more or less easy according to the nature of the inspection considered.

In a first model we took into account the fact that, when an inspection is carried out several times, the results of the preceding inspections increase its quality (p → 1, q → 1).

Remark: More generally, we ought to allow three possible conclusions of an inspection: F (fraud), NF (no fraud), and ? (impossibility of concluding). This would lead to a table such as the one presented in Figure 3.

In order to simplify the model, and although including this refinement would present no difficulty, we assumed that the impossibility of concluding translates into no fraud, and we grouped p2 and p3 into 1 − p as well as q2 and q3 into q.

1.3. Choice of a Partition

The whole of the REAL can be divided into a greater or smaller number of states, according to whether one wants to obtain very precise information or not. The only condition we must impose on this division is that it constitute a PARTITION.


                      State of Reality
Conclusion            F               NF
F                     p1              q1
NF                    p2              q2
?                     p3              q3

with:  p1 + p2 + p3 = 1,  q1 + q2 + q3 = 1

Figure 3





Let us give a few examples of possible partitions:

Into two states:
  fraud
  no fraud

Into four states:
  no fraud
  fraud according to process I alone
  fraud according to process I and one or more others
  fraud according to one or more processes other than I

Into 2^m states (if there are m processes):
  no fraud
  fraud according to process I alone
  fraud according to process II alone
  ...
  fraud according to processes I and II alone
  fraud according to processes I and III alone
  ...
  fraud according to all the processes


We shall not dwell on the advantage of the most complete partition, into 2^m states, over the two-state partition (F and NF); it is well known that the finest partition is the one that avoids losing the information provided by knowledge of the differences existing between the processes.

In practice, on the other hand, the lack of information on what distinguishes two processes will force us to merge them and to adopt a partition that does not reach the maximum degree of decomposition.


2. METHOD OF CHOOSING THE SUCCESSIVE INSPECTIONS

2.1. General Remarks

The spirit of the method is as follows: at a given instant we have a certain idea of reality, which we represent by a vector P whose components P_k are the probabilities attached to the different states of the real distinguished in the chosen partition.

To each vector P we attach the entropy

H(R) = − Σ_{k ∈ K} P_k log P_k

We know, moreover (Section 1.1), that a control consists of a sequence of unit inspections, all of the same cost. When the result of an inspection gives the entropy, for the first time, a value below a certain threshold H_e(R), the control will be finished: the number of inspections making up a control is therefore variable.
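In code, the entropy H(R) attached to a vector P and the stopping test against the threshold H_e(R) can be sketched as follows (a minimal Python illustration of ours, not the author's program; natural logarithms are assumed):

```python
import math

def entropy(P):
    """H(R) = -sum_k P_k log P_k (with 0 log 0 taken as 0)."""
    return -sum(p * math.log(p) for p in P if p > 0.0)

def control_finished(P, H_e):
    """The control stops as soon as the entropy falls below the threshold H_e(R)."""
    return entropy(P) <= H_e

# Equiprobability over 4 states maximizes the entropy (log 4 nats).
assert abs(entropy([0.25] * 4) - math.log(4)) < 1e-12
# A sharply peaked P ends the control.
assert control_finished([0.99, 0.01], H_e=0.1)
```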

The mean cost of the control will be minimal if, by an appropriate method of successive choices, one can choose the inspections to be carried out so as to minimize their mean number.

How can this be achieved?





1. A first method consists in choosing, at each step, the inspection that provides the maximum mean information. This method would even be the exact solution of our problem if the information provided by an inspection were independent of the probabilities of the different states of the real (vector P) at the moment that inspection takes place. This is evidently not the case (note, indeed, that our model would then no longer be of any interest, since the same inspection, being always the best, would be chosen anew at every step); the mean information contributed by an inspection is in fact a function of the choices made for the preceding inspections and of the results to which they led. It is for this reason that the expected information provided by two successive inspections is not necessarily maximal if one first chooses the inspection giving the most mean information at that moment and then repeats this same choice at the following step.

2. We are thus led to consider another method.

This method is illustrated by Figure 4. It consists in asking which inspection to choose for the second step if, at the first step, we chose A and obtained the result fraud (A_F), or if we chose A and obtained no fraud (A_NF).

[Figure 4: a decision tree over steps i + 1 and i + 2.]

For this purpose we compare among themselves all the mean informations provided by triplets of the form (A, B, C), whose meaning is:


1st inspection = A

2nd inspection = B  if A ⇒ A_F
                 C  if A ⇒ A_NF


This comparison allows the most advantageous triplet to be found, and we conclude from it that we must begin with A. Inspection A having been carried out and having given, say, the result fraud, we shall not necessarily then perform inspection B; indeed, we start the same procedure over again in order to determine, as a function of this latest result, the best triplet for the inspections of the 2nd and 3rd steps.

This method can be refined by planning the inspections to be made 3, 4, ..., n steps ahead; one is limited, however, by the amount of computation to be done at each step. Moreover, the gain to be expected from choosing the first inspection on the basis of a forecast (n + 1) steps ahead, instead of a forecast n steps ahead, tends rapidly toward zero as n increases; in practice one will therefore be content with a forecast 2 or 3 steps ahead.

Note in this connection that the computation horizon is infinite: there can indeed be controls of infinite length (though of zero probability); we therefore cannot contemplate exhaustive computations.

For the same reason, the number of final states (the number of possible vectors P at the end of a control) is also infinite; this deprives us a priori of the possibility of using an algorithm of the Huffman–Picard type (theory of questionnaires), an algorithm that starts from the last questions to be asked and works back to the first. Moreover, we should note that Picard shows that the gain brought by such a method, compared with that obtained by maximizing the mean information at each question, is rather small.

In summary, forecasting 2 or 3 steps ahead seems to us, a fortiori, a good approximation of the optimal solution (which, as we have seen, is incomputable).

For the models we have built so far, we have been content to choose at each step, among all the inspections, the one that on average contributed the best information.


2.2. Description of the Choice Method

2.2.1. Data

The data to be introduced into the model consist of the list of fraud processes and of the activities attached to them, and the list of the inspections appropriate to each activity.

The processes and activities can be summarized in a table. Figure 5 gives a schematic example of the table representing 3 processes according to the finest partition, into 2^n states.





[Figure 5: a table of the processes versus the activities.]

Note: The crosses indicate the various activities belonging to the processes.

Figure 5.

As for the inspections, they are classified by activity, and for each one the probabilities attached to it are indicated; for example:

activity a — inspections:  A   with p_A, q_A
                           A′  with p_A′, q_A′
                           A″  with p_A″, q_A″

One can also include in these data the fact that, when a new inspection A is carried out, the results of the preceding one(s) are already available, which brings a better efficiency, expressed by:

inspection A:  1st performance: p_A1, q_A1
               2nd performance: p_A2, q_A2
               ...

with  p_A1 < p_A2 < ... < 1
      q_A1 < q_A2 < ... < 1


2.2.2. Initial Choices

Having determined the activities and the inspections relating to them, we must choose a partition of the real and an initial vector P0 representing the subjective probabilities of the different states of the real at the moment the control begins.

Choice of a partition

This choice depends on the possibilities given to us, either within the framework of the disarmament agreement or by the technical possibilities of inspection. From it we deduce a table giving the inspections bearing on the different states of the real.





If, for example, we have chosen the four-state partition:

1. Fraud according to process I.

2. Fraud according to I and at least one other.

3. Fraud, but not according to I.

4. No fraud.

we obtain the following table (Figure 6):


[Figure 6: a table showing, by crosses, which of the states of the real (1)–(4) each inspection (A, A′, A″, B, B′, C, D, D′, grouped by the activities a, b, c, d) bears on.]


Choice of the initial vector P0

If one leans toward one state rather than another (by reason of earlier intelligence), the vector P0 will bear witness to this.

Otherwise one will choose for P0 the vector corresponding to equiprobability of the states of the real, and hence to the maximum value of H(R). This choice is debatable, but no other can be imagined.

2.2.3. Criterion for Choosing the (i + 1)th Inspection

Note: To make the formulas more readable, we have replaced the p_J and q_J relating to the quality of inspection J by α_J and β_J.

After i inspections, the vector of the probabilities of the different states has the value P_i; the entropy is H_i(R).

For each of the inspections A, A′, A″, B, ..., J, ... we shall compute the mean information contributed.

Let us do so for inspection J. The set K of states of the possible can be divided into K_J, the set of those states corresponding to a fraud in the activity j controlled by J, and K′_J, the set of those states corresponding to no fraud in that activity.





We can first compute the probabilities Q_{i+1}(J_F) and Q_{i+1}(J_NF) of obtaining the results fraud and no fraud on making inspection J:

Q_{i+1}(J_F)  = α_J Σ_{k ∈ K_J} P_i^k + (1 − β_J) Σ_{k ∈ K′_J} P_i^k

Q_{i+1}(J_NF) = (1 − α_J) Σ_{k ∈ K_J} P_i^k + β_J Σ_{k ∈ K′_J} P_i^k

α_J and β_J take the respective values α_J1 and β_J1 if inspection J has not yet been made, α_J2 and β_J2 if it has already taken place once, and so on.

Let us now compute P_{i+1,J_F}, the new value of the vector P_i when the (i + 1)th inspection has given the result fraud in J, and P_{i+1,J_NF} when the (i + 1)th inspection has given no fraud.

P_{i+1,J_F} has as its kth component P^k_{i+1,J_F}, which equals

P^k_{i+1,J_F} = α_J P_i^k / Q_{i+1}(J_F)              for k ∈ K_J

P^k_{i+1,J_F} = (1 − β_J) P_i^k / Q_{i+1}(J_F)        for k ∈ K′_J

Similarly, P_{i+1,J_NF} has as its kth component

P^k_{i+1,J_NF} = (1 − α_J) P_i^k / Q_{i+1}(J_NF)      for k ∈ K_J

P^k_{i+1,J_NF} = β_J P_i^k / Q_{i+1}(J_NF)            for k ∈ K′_J

From these we deduce the values of

H_{i+1,J_F}(R)  = − Σ_{k ∈ K} P^k_{i+1,J_F} log P^k_{i+1,J_F}

H_{i+1,J_NF}(R) = − Σ_{k ∈ K} P^k_{i+1,J_NF} log P^k_{i+1,J_NF}

and hence of the mean entropy after a control in J:

H̄_{i+1,J}(R) = Q_{i+1}(J_F) H_{i+1,J_F}(R) + Q_{i+1}(J_NF) H_{i+1,J_NF}(R)

The same computation is carried out for all the inspections. One then chooses the inspection X such that H̄_{i+1,X} is the smallest of all the mean entropies computed.
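The selection rule of this section — a Bayes update of P for each possible result, then the mean posterior entropy — can be sketched as below. This is our own illustrative encoding (boolean masks stand for the sets K_J, logarithms are natural, and none of the identifiers come from the paper's models):

```python
import math

def entropy(P):
    """H(R) = -sum_k P_k log P_k (0 log 0 taken as 0)."""
    return -sum(p * math.log(p) for p in P if p > 0.0)

def posterior(P, mask, alpha, beta, result_fraud):
    """Bayes update of P after an inspection J of quality (alpha, beta).

    mask[k] is True for the states k in K_J (fraud in the activity watched by J).
    Returns Q_{i+1}(J_F) or Q_{i+1}(J_NF), and the updated vector.
    """
    if result_fraud:
        w = [alpha if m else 1.0 - beta for m in mask]
    else:
        w = [1.0 - alpha if m else beta for m in mask]
    Q = sum(wk * pk for wk, pk in zip(w, P))
    return Q, [wk * pk / Q for wk, pk in zip(w, P)]

def mean_entropy_after(P, mask, alpha, beta):
    """Mean entropy after J: Q(J_F) H(P|F) + Q(J_NF) H(P|NF)."""
    QF, PF = posterior(P, mask, alpha, beta, True)
    QNF, PNF = posterior(P, mask, alpha, beta, False)
    return QF * entropy(PF) + QNF * entropy(PNF)

def choose_inspection(P, inspections):
    """Pick the inspection X whose mean entropy after the (i+1)th step is smallest."""
    return min(inspections, key=lambda J: mean_entropy_after(P, *inspections[J]))

# Two states (fraud / no fraud); "A" is informative, "B" is pure noise.
P = [0.5, 0.5]
inspections = {"A": ([True, False], 0.9, 0.9), "B": ([True, False], 0.5, 0.5)}
assert choose_inspection(P, inspections) == "A"
```

A pure-noise inspection (α = β = 1/2) leaves P, and hence the entropy, unchanged, so the rule always prefers any informative inspection over it.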

2.2.4. Continuation of the Control

The (i + 1)th inspection having been chosen, it actually takes place and gives the result fraud or the result no fraud.

If it is fraud: one takes P_{i+1,X_F} as the value of P_{i+1}, and hence H_{i+1,X_F}(R) as the value of H_{i+1}(R).

If it is no fraud: one takes P_{i+1,X_NF} as the value of P_{i+1}, and hence H_{i+1,X_NF}(R) as the value of H_{i+1}(R).

One can then pass to the next stage, which is the choice of the (i + 2)th inspection.





2.2.5. End of the Control

To test whether or not the control should be continued, one compares H_{i+1}(R) with the fixed efficiency threshold H_e(R).

If H_{i+1}(R) > H_e(R): one passes to the next stage.

If H_{i+1}(R) ≤ H_e(R): end of the control.

The realized state of the possible corresponds to the largest component of P_{i+1}.

3. THE SIMULATION MODELS

We have built several simulation models in application of the method we have just described. Each of them had a well-defined purpose.

First, the convergence problems of the method had to be studied, since theoretically this convergence was not assured. We were able to see that we always reached a conclusion after a reasonable number of inspections.

Next, for practical purposes, we were led to study how such a model behaved when facing a fluctuating reality, that is, a country committing fraud in an intelligent manner. This led us to undertake a study of the weighting of the past results of the control, so that the controller can progressively free himself from the earlier results of the inspections and thus follow the evasive action of the fraudster more easily.

Finally, we have recently undertaken an application of the method to a real armament manufacturing program, adding to it a dynamic study of stocks, production launches, and so on.


THE APPLICATION OF INFORMATION THEORY 
TO A MODEL OF SEQUENTIAL SELECTION 
OF INSPECTION METHODS 

RÉSUMÉ

The Problem: Inspection for Arms Control 

In this article we define several processes a country may employ to build 
clandestine stockpiles of armaments. We want to know which process is actually 
used. For this purpose the Control Agency has chosen several types of inspec- 
tion, more or less effective, each type being capable of controlling one or several 
processes. 

The Model 

At each step we have an image of the activities of the controlled country (vector P, the ith element of which represents the probability for the ith process to be used).





From this point we try to determine the best next inspection (the one that 
gives us the greatest information, measured by the entropy). Once this choice has 
been made, we have the best inspection and, knowing its result, we re-evaluate 
the vector P, using the Bayesian principle. 

We let the model go until the entropy reaches a value such that we think we know quasi-perfectly which process the controlled country has been using.

Illustration of the Model

We describe briefly some simulation models we built to apply the theoretical 
model to actual problems. 


ON THE APPLICATION OF FIDUCIAL 
PROBABILITY TO STATISTICAL DECISIONS 

Sur l'Application des Probabilités Fiduciales aux
Décisions Statistiques

G. Menges and H. Diehl

Seminar für theoretische Statistik und Ökonometrie
Saarbrücken, Germany


1. THE BASIC MODEL

A decision maker has certain actions a out of an action space A at his disposal, and he faces a certain set Ω of possible states of nature μ.

To each combination of an action a ∈ A and a state of nature μ there corresponds unambiguously a real-valued loss L(a, μ), given by a bounded function, the so-called loss function L over the product space A × Ω.

For the following we assume that Ω is a real one-dimensional parameter space of a certain continuous distribution function f(x; μ) of a random variable X, which can take values x in a sample space 𝒳. The functional form f is known.

The decision maker has the possibility of gathering information concerning μ by observing realizations x of X in 𝒳. Furthermore, we allow the unknown state μ itself to be a random variable with a certain continuous distribution function g(μ; θ), depending on a real-valued parameter θ out of a space Θ. The functional form g is known; θ is called an a priori distribution over Ω; Θ is the space of the a priori distributions θ over Ω.

If the decision maker knows which μ ∈ Ω is the true one, he has to make a decision under certainty. This case, however, is excluded from our considerations. We are dealing only with uncertainty; that is, the true state of nature is unknown.

The uncertainty can have two sources (see [8], p. 1 f.):

1. The first source of uncertainty is randomness: the state of nature μ is a random variable with distribution g(μ; θ).

2. The second source of uncertainty is lack of knowledge. We distinguish two kinds of lack of knowledge:

(a) The true μ is an unknown constant. Lack of knowledge concerns μ directly.

(b) μ is a g(μ; θ)-distributed random variable with unknown θ. Lack of knowledge concerns the distribution of μ.

The decision maker is confronted with one of the following types of un- 
certainty: 

(1) Uncertainty arising from randomness only. 

(2) Uncertainty arising from lack of knowledge of kind (a). 

(3) Uncertainty arising from randomness superimposed by lack of 
knowledge of kind (b). 

The decision problem consists in the selection of an action a* e A, which 
is “optimal.” The definition of optimality is usually incorporated into the 
criterion adopted by the decision maker. 

2. KNOWLEDGE OF θ

2.1. θ Known (Uncertainty Arising from Randomness Only)

2.1.1. Objectively a priori

If an a priori distribution θ over the space Ω exists (assumed to be independent of a) and is objectively known to the decision maker, he then faces the ideal case: the Bayes criterion is applicable and the decision problem can be solved in a brilliant way. An action a* ∈ A is called a Bayes solution with respect to θ if

r(a*, θ) = inf_{a ∈ A} r(a, θ),

where, by definition,

r(a, θ) = ∫_Ω L(a, μ) g(μ; θ) dμ.





By observing x, and by means of the application of Bayes' theorem, the decision maker is even given the possibility of improving the knowledge of μ directly. The a posteriori distribution of μ, given x, is

p(μ | x) = g(μ; θ) f(x; μ) / ∫_Ω g(μ; θ) f(x; μ) dμ,

where g(μ; θ) is the known a priori distribution of μ, and f(x; μ) is the likelihood of μ, given x.

This ideal case, however, is unlikely to occur in practical situations. Precise objective a priori knowledge of θ is to be expected only in some applications in the natural sciences, in which (and only in which) knowledge of θ can be gained by means of deduction from known physical (or biological, etc.) laws.


2.1.2. Subjectively a priori

If objective (physically deduced) a priori knowledge of θ is not available, as in most applications of decision theory in the field of the social sciences, the decision maker can try to establish θ subjectively. There may be some doubts about the scientific justification and the empirical validity of personal probabilities [5]. In any case, it will hardly ever be possible to produce precise knowledge of the true θ a priori subjectively. How, then, should the decision maker proceed if the subjective knowledge of θ is not precise enough for the application of the Bayes criterion, or if he has scruples about applying personal probabilities at all? In order to approach the answer to this question, let us distinguish two cases: total and partial lack of knowledge concerning θ.


2.2. θ Unknown (Uncertainty Arising from Randomness
Superimposed by Lack of Knowledge)

2.2.1. Total Lack of Knowledge 

If the lack of knowledge is total and the decision maker is not able or not willing to observe an x, there is only one way open to him: the application of the (pure) minimax criterion. An action a₀ ∈ A is called a minimax solution if

sup_{μ ∈ Ω} L(a₀, μ) = inf_{a ∈ A} sup_{μ ∈ Ω} L(a, μ).

The application of the (pure) minimax criterion is independent of any knowledge about θ.

2.2.2. Partial Lack of Knowledge

If the decision maker knows that the unknown θ lies in a certain proper subset Θ* ⊂ Θ, he can apply (see [4] and [6]) a certain combination of the Bayes and minimax criteria, which is called the "extended Bayes criterion." He is recommended to choose a* ∈ A such that

sup_{θ ∈ Θ*} r(a*, θ) = inf_{a ∈ A} sup_{θ ∈ Θ*} r(a, θ),

where

r(a, θ) = ∫_Ω L(a, μ) g(μ; θ) dμ.

The Bayes character of this criterion lies in forming the risks r(a, θ) for each θ ∈ Θ* and each a ∈ A. The minimax character consists in choosing a* ∈ A such that the highest risk is as small as possible. If Θ* contains only one element, the extended Bayes criterion goes over into the pure Bayes criterion.
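On discretized grids for A, Θ* and the states μ, the extended Bayes criterion reads as a minimax over Bayes risks. A small self-contained sketch (the grids, the degenerate g, and the quadratic loss are our own illustrative assumptions, not taken from the article):

```python
# Extended Bayes criterion on discrete grids: choose the action whose
# worst Bayes risk over theta in Theta* is smallest.

def risk(a, theta, loss, states, g):
    """r(a, theta) = sum_mu L(a, mu) g(mu; theta): a discrete stand-in for the integral."""
    return sum(loss(a, mu) * g(mu, theta) for mu in states)

def extended_bayes(actions, theta_star, loss, states, g):
    """inf over a of sup over theta in Theta* of r(a, theta)."""
    return min(actions, key=lambda a: max(risk(a, th, loss, states, g) for th in theta_star))

# Illustration: mu concentrated at theta, quadratic loss; against
# Theta* = {-1, 1} the middle action 0 has the smallest worst risk.
states = [-1, 0, 1]
g = lambda mu, theta: 1.0 if mu == theta else 0.0
loss = lambda a, mu: (a - mu) ** 2
assert extended_bayes([-1, 0, 1], [-1, 1], loss, states, g) == 0
```

If Θ* shrinks to a single θ, `extended_bayes` reduces to the pure Bayes choice, as the text notes.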

2.2.3. Attaining Knowledge by Observations

Whether the lack of knowledge is total or partial, if the decision maker has a "positive" attitude toward empirical observation, he will observe x ∈ 𝒳 and try to utilize the observation in order to improve his knowledge of θ. Such a utilization can be carried out by means of fiducial probability.


3. THE CONCEPT OF FIDUCIAL PROBABILITY 

The concept of fiducial probability was established by R. A. Fisher [1] (see also [2], [3], [7], and [9]). Fisher started with the idea of constructing from given observations a distribution, similar to the Bayes a posteriori distribution, for an unknown parameter, without making use of any a priori distribution.

In the typical situation of the basic model (Section 1) we are confronted with the same problem. We are given certain observations out of 𝒳, and we would like to make probability statements either about the unknown μ ∈ Ω or about the unknown θ ∈ Θ, according to the given situation of uncertainty.

A short introduction to fiducial probability may be useful. Let f(x; μ) be the (continuous) distribution function of a random variable X. When μ is a known constant, f(x; μ) allows probability statements concerning future realizations of the random variable X. Now let μ be unknown and let some observations x₁, ..., x_n be given. The problem is to make probability statements about the unknown parameter μ on the basis of the given observations x₁, ..., x_n.

In other words, how can we derive a distribution function over the space Ω of μ for given observations x₁, ..., x_n when we know the distribution function f(x; μ) over the sample space 𝒳 of X?

Let T be a sufficient statistic for μ that depends only on the observations x₁, ..., x_n: T = T(x₁, ..., x_n). The distribution function of T is denoted by f*(T; μ) and the cumulative distribution function of T is denoted by F(T; μ).

If for given T the function

|∂F(T; μ)/∂μ|

over Ω exists and is a probability measure over Ω, it is considered to be the density function of the so-called fiducial distribution of the unknown parameter μ. The quantity

|∂F(T; μ)/∂μ| dμ





tells us to what degree we should believe, in the light of the observation T, that the unknown μ lies in the (infinitesimal) range dμ. The function

|∂F(T; μ)/∂μ|

over Ω can therefore be considered as an objective measure of the uncertainty about μ, and a (fiducial) probability statement about the unknown μ can be derived from it.

In most cases of a single parameter μ, the fiducial distribution for μ exists if there is a sufficient statistic T for μ. The continuity of f(x; μ) is a necessary condition for the existence of a fiducial distribution.


4. THE APPLICATION OF THE FIDUCIAL ARGUMENT 
TO THE UNCERTAINTY TYPES (2) AND (3) 


4.1. Introduction 

In the case of given observations x out of 𝒳, the fiducial concept can be applied successfully to decision problems in which the uncertainty arises either from lack of knowledge concerning μ only [type (2) of uncertainty] or from randomness of μ superimposed by lack of knowledge concerning θ [type (3) of uncertainty]. Type (1) of uncertainty is irrelevant for fiducial considerations, for in this case the distribution θ is given, and probability statements under given observations can be derived by means of Bayes' theorem (see Section 2.1.1).


4.2. Uncertainty Arising from Lack of Knowledge Concerning μ

In certain decision problems the unknown state of nature μ can be assumed to be a constant for which no a priori distribution exists. Uncertainty arises from lack of knowledge concerning μ only. (Problems of this type occur, for instance, in connection with point estimation.)

We are given n independent observations x₁, ..., x_n ∈ 𝒳, all coming from the same distribution f(x; μ), μ being the unknown state of nature. Let T = T(x₁, ..., x_n) be a sufficient statistic for μ and let f*(T; μ) be the distribution function of T, given μ.

What we are looking for is the fiducial distribution of μ, given T. From f*(T; μ) we get the cumulative distribution function of T, which we denote by F(T; μ). If the fiducial distribution for μ exists, its density function is given by


SF(T;p) 

dp 


dp. 


Since this fiducial density is a mathematical measure, we can construct in our 
decision model something like a risk function by weighting the losses L(a , p) 
with | dF{T\ p)j dp\ over £2: 


r(a\T) = j L(a,p) 
J cx 


dF(T; p) 
dp 


dp. 





The "risk function" r(a | T) is logically similar to the Bayes risk function, but r(a | T) gives risks conditional on the observation T.

When choosing a* ∈ A such that

r(a* | T) = inf_{a ∈ A} r(a | T),

the Bayes criterion is applied in a very general sense.

It is the special feature of this solution that a* depends on T, so that it may be adequate to speak of a T-optimal Bayes solution a*(T).


Example. Let μ be the mean value of an N(μ, 1) normal distribution (σ² = 1). The values x₁, ..., x_n have been observed. x̄ = (1/n) Σ xᵢ is a sufficient statistic for μ and has the distribution function N(μ, 1/n). The fiducial density for μ can easily be derived as the normal distribution N(x̄, 1/n) over μ. The risk function r(a | x̄) is then given by

r(a | x̄) = ∫_{−∞}^{+∞} L(a, μ) (n/2π)^{1/2} exp[−n(μ − x̄)²/2] dμ.
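For a quadratic loss L(a, μ) = (a − μ)² (our choice, purely for illustration), the T-optimal Bayes solution under this N(x̄, 1/n) fiducial density is a* = x̄, and the minimal risk equals the fiducial variance 1/n; a numerical sketch:

```python
import math

def fiducial_risk(a, xbar, n, half_width=8.0, steps=4001):
    """r(a | xbar) ~= integral of (a - mu)^2 sqrt(n/2pi) exp(-n(mu - xbar)^2 / 2) dmu,
    approximated by a Riemann sum on [xbar - half_width, xbar + half_width]."""
    h = 2.0 * half_width / (steps - 1)
    total = 0.0
    for i in range(steps):
        mu = xbar - half_width + i * h
        dens = math.sqrt(n / (2.0 * math.pi)) * math.exp(-n * (mu - xbar) ** 2 / 2.0)
        total += (a - mu) ** 2 * dens * h
    return total

xbar, n = 1.3, 4
# The risk at a = xbar equals the fiducial variance 1/n; any other a does worse.
assert abs(fiducial_risk(xbar, xbar, n) - 1.0 / n) < 1e-5
assert fiducial_risk(xbar + 0.5, xbar, n) > fiducial_risk(xbar, xbar, n)
```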


4.3. Uncertainty Arising from Randomness of μ Superimposed by
Lack of Knowledge Concerning θ

Numerous decision problems are of the following type: the (unknown) state of nature μ is a random variable whose distribution θ is unknown. In this case uncertainty arises from both sources, randomness of μ and lack of knowledge (concerning θ).

Some independent observations x₁, ..., x_n ∈ 𝒳, all produced by the same distribution f(x; μ), are given, in which μ itself is the result of a random drawing from the distribution g(μ; θ) with unknown parameter θ.

Let T = T(x₁, ..., x_n) be a sufficient statistic for μ and let f*(T; μ) be the distribution function of T, given μ. In order to throw some light from the observed T on the unknown θ, we must have the distribution function of T in dependence on θ. We first establish the joint distribution of (T, μ), given θ:

f*(T; μ) g(μ; θ).

From this we get the marginal distribution of T, given θ, by integrating out the variable μ:

h(T; θ) = ∫_Ω f*(T; μ) g(μ; θ) dμ.

The distribution h(T; θ) yields the fiducial distribution of θ, given T, assuming that it exists at all. Let H(T; θ) be the cumulative distribution function of T, given θ; then the density of the fiducial distribution of θ is given by the function

|∂H(T; θ)/∂θ|




Figure 1

defined over Θ. T as a single observation is assumed to be a sufficient statistic for θ. The fiducial distribution function |∂H(T; θ)/∂θ| can be used for the construction of an α_f-fiducial region Θ* ⊂ Θ in which the unknown θ lies with a given (fiducial) probability α_f (α_f close to 1). Considerations may be restricted to the proper subset Θ* ⊂ Θ because, in the light of the observation T, it is unlikely that θ will lie outside of Θ* (Figure 1).

If the decision maker considers 1 − α_f small enough and ignores the possibility of θ being outside Θ*, it is recommended that he apply the extended Bayes criterion. Following this criterion (see [6]), he chooses a* ∈ A such that

sup_{θ ∈ Θ*} ∫_Ω L(a*, μ) g(μ; θ) dμ = inf_{a ∈ A} sup_{θ ∈ Θ*} ∫_Ω L(a, μ) g(μ; θ) dμ.

If, however, the decision maker fears the possibility of a very disadvantageous θ outside Θ*, he may, before applying the extended Bayes criterion, exclude all actions a for which

sup_{θ ∈ Θ} ∫_Ω L(a, μ) g(μ; θ) dμ > s,

where s is a certain risk limit he does not want to surpass.
Example (see [8]). We assume that

f(x; μ) = (1/√(2π)) exp[−½(x − μ)²],

g(μ; θ) = (1/√(2π)) exp[−½(μ − θ)²],

θ being an unknown constant. We are given the independent observations x₁, ..., x_n, all coming from the same f(x; μ). The statistic x̄ = (1/n) Σ xᵢ is sufficient for μ. The distribution of x̄, given μ, is given by

f*(x̄; μ) = (n/2π)^{1/2} exp[−n(x̄ − μ)²/2].


The joint distribution of (x̄, μ), given θ, is

(√n/2π) exp[−n(x̄ − μ)²/2 − (μ − θ)²/2].

The marginal distribution of x̄, given θ, is

h(x̄; θ) = (1/√(2π(1 + 1/n))) exp[−(x̄ − θ)²/(2(1 + 1/n))];

that is, an N(θ, 1 + 1/n) distribution over 𝒳. From h(x̄; θ) we can see that θ has an N(x̄, 1 + 1/n) fiducial distribution over Θ.


This fiducial distribution, once established in an actual problem, would allow us to construct any α_f-fiducial region Θ* for θ.

As n tends toward infinity, we approach the case of an observation μ being drawn from the N(θ, 1) distribution over Ω with unknown θ. In this case θ has an N(x̄, 1) fiducial distribution.

We can use the fiducial distribution |∂H(T; θ)/∂θ| in a second way. In order to get a distribution over Ω, we combine the (known) distribution |∂H(T; θ)/∂θ| over Θ with the (unknown) distribution g(μ; θ) over Ω. The product g(μ; θ) · |∂H(T; θ)/∂θ| represents a joint distribution over Θ × Ω. (It is remarkable that in this distribution we have, in addition to a fiducial component, a frequentist component.) By integrating out the parameter θ we get the wanted distribution over Ω:

∫_Θ g(μ; θ) |∂H(T; θ)/∂θ| dθ.

We call this distribution a “fiducial distribution for future values of /x.” 

In the same way that we constructed "risks" r(a | T) in Section 4.2 by weighting the losses L(a, μ) with a certain fiducial measure |∂F(T; μ)/∂μ| over Ω, we now construct "risks" r*(a | T) by weighting the losses L(a, μ) with

$$\int_{\Theta} g(\mu; \theta) \left| \frac{\partial H(T; \theta)}{\partial \theta} \right| d\theta$$

over Ω. We get

$$r^*(a \mid T) = \int_{\Omega} L(a, \mu) \int_{\Theta} g(\mu; \theta) \left| \frac{\partial H(T; \theta)}{\partial \theta} \right| d\theta\, d\mu.$$

As before, the "risks" r*(a | T) are "risks" conditional on T. Again we apply the Bayes criterion in the generalized sense and choose a* ∈ A such that

$$r^*(a^* \mid T) = \inf_{a \in A} r^*(a \mid T).$$

The action a* again is called a T-optimal Bayes solution.





G. MENGES AND H. DIEHL 


Example (see [8]). In the above example we derived an N(x̄, 1 + 1/n)-fiducial distribution for θ. Combining it in the described way with g(μ; θ) = (1/√(2π)) exp[−½(μ − θ)²] (θ unknown), we get as fiducial distribution for future values of μ

$$\int_{\Theta} \left(\frac{n}{n+1}\right)^{1/2} \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{n(\theta - \bar{x})^2}{2(n+1)}\right] \frac{1}{\sqrt{2\pi}} \exp\left[-\tfrac{1}{2}(\mu - \theta)^2\right] d\theta = \left(\frac{n}{2n+1}\right)^{1/2} \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{n(\mu - \bar{x})^2}{2(2n+1)}\right],$$

which is the density of an N(x̄, 2 + 1/n)-fiducial distribution over Ω. With this distribution we obtain the "risks"

$$r^*(a \mid \bar{x}) = \int_{\Omega} \left[\frac{n}{2\pi(2n+1)}\right]^{1/2} L(a, \mu) \exp\left[-\frac{n(\mu - \bar{x})^2}{2(2n+1)}\right] d\mu.$$
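For a concrete loss function these risks are simple one-dimensional integrals. The sketch below (a numerical check with names of our own choosing, not part of the paper) integrates the N(x̄, 2 + 1/n) fiducial density against a loss on a grid; for the quadratic loss L(a, μ) = (a − μ)² the exact value is (a − x̄)² + 2 + 1/n, so the risk is minimized at a = x̄.

```python
import math

def r_star(a, xbar, n, loss=lambda a, mu: (a - mu) ** 2):
    # risk r*(a | xbar): the loss weighted by the N(xbar, 2 + 1/n) fiducial
    # density for future values of mu, integrated by the trapezoid rule
    var = 2.0 + 1.0 / n
    sigma = math.sqrt(var)
    lo, hi, steps = xbar - 10 * sigma, xbar + 10 * sigma, 20000
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        mu = lo + i * h
        dens = math.exp(-((mu - xbar) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
        w = 0.5 if i in (0, steps) else 1.0   # trapezoid end weights
        total += w * loss(a, mu) * dens
    return total * h
```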


5. CONCLUSION 

The applicability of the fiducial concept in statistical decision problems is, of course, limited to cases in which a fiducial distribution exists. Unfortunately, these cases make up only a small, though highly important, area of practical decisions. As we have seen, fiducial distributions for μ or θ exist as a rule only if there is a sufficient statistic for the parameter under consideration and if f and g are continuous distribution functions. Furthermore, in order to apply fiducial probabilities to statistical decisions in the way proposed in this article, precise knowledge of the functions f and g is required. It is true that these conditions and requirements are serious limitations for practical applicability. The same cannot be said of an argument that has kindled a lively controversy in the literature: that parameters like μ or θ are by definition constants but that they are treated as random variables in fiducial theory. They are, in fact, constants, but they receive the logical status of well-defined random variables after an observation has been made.

In view of the difficulties inherent in the fiducial argument and of its limitations, we might be inclined to discard it and to prefer the confidence concept. The theory of confidence intervals, however, does not yield probability statements on unknown quantities like μ or θ. Therefore it cannot supersede the fiducial argument. The likelihood function, on the other hand, can be useful when no prior distribution is known and the fiducial method does not work (see [8]; also H. Diehl and D. A. Sprott, "Die Likelihoodfunktion und ihre Verwendung beim statistischen Schluß," Statistische Hefte, 6 (1965), No. 2). The likelihood function is a weaker measure of uncertainty than fiducial probability, but it is less restricted in its applications.





6. REFERENCES 

[1] R. A. Fisher, "Inverse Probability," Proc. Cambridge Phil. Soc., 26, 528-535 (1930).

[2] R. A. Fisher, "The Fiducial Argument in Statistical Inference," Ann. Eugenics, 6, 391-398 (1935).

[3] D. A. S. Fraser, "On the Definition of Fiducial Probability," Bull. Int. Stat. Inst., 40, 842-856 (1963).

[4] G. Menges, "The Adaptation of Decision Criteria and Application Patterns," Proc. IFORS Conf., Oslo, 1963; Paris, 585-594 (1964).

[5] G. Menges, "Über Wahrscheinlichkeitsinterpretationen," Statistische Hefte, 6 (1965), Heft 2.

[6] G. Menges, "On the 'Bayesification' of the Minimax Principle," Unternehmensforschung, 10 (1966), Heft 2.

[7] D. A. Sprott, "Necessary Restrictions for Distributions a Posteriori," J. Roy. Stat. Soc., Ser. B, 22, 312-318 (1960).

[8] D. A. Sprott, "Fiducial Probability, Likelihood, and Decision Theory," University of Waterloo, 1965 (mimeographed seminar report).

[9] D. A. Sprott, "Statistical Estimation: Some Approaches and Controversies," Statistische Hefte, 6 (1965), Heft 2.

[10] A. Wald, Statistical Decision Functions, New York, London, 1950.


ON THE APPLICATION OF FIDUCIAL
PROBABILITIES TO STATISTICAL DECISIONS

SUMMARY

After a brief exposition of the statistical decision model taken as a basis, the authors examine the various types of uncertainty that may prevail in it. It has two sources: the random character of the states μ of nature, and ignorance of the true state. If the probability law θ on the space of states is known, the expected loss incurred as a consequence of an action serves to evaluate that action (the Bayes optimality criterion). It may be advantageous to draw a sample from a population determined by the actual state; the prior probability law is then transformed into a posterior law (by Bayes's theorem) and the Bayes criterion applied. If one does not know θ, the probability law, one may nevertheless know that it belongs to a restricted class Θ*, and the authors recommend applying a combination of the Bayes and minimax criteria. Alternatively, one knows a sample drawn according to a probability law (not to be confused with f) that depends on the unknown state of nature. If certain formal conditions are satisfied, this can lead to a probability law on the state μ of nature, called "fiducial" after R. A. Fisher, which expresses in an objective manner how credible the various possible values of μ are in virtue of the information contained in the sample. With this fiducial law the Bayes criterion can again be applied. A similar procedure can be used to obtain a fiducial probability law over Ω if it is to be feared that the chosen action will encounter a realization of μ other than the one sampled.





MODEL INFERENCE IN ADAPTIVE PATTERN 
RECOGNITION 

Déduction à Partir d'un Modèle dans la
Reconnaissance Adaptive des Formes

George R. Murray, Jr., and Richard D. Smallwood

Institute in Engineering-Economic Systems
Stanford University, California
United States of America

1. INTRODUCTION 





Figure 1. Schematic diagram of adaptive pattern-recognition system. 








of η is left unspecified; it is a parameter of central interest in the work to follow.
In view of the finite resolution possible in physical measurements, we believe 
that limiting the responses to a finite set is a reasonable representation of 
sensor capability. 

The primary advantage of this form of sensor model is that, as we shall see 
shortly, it permits us to define the space of all possible sensor models (of this 
form), to encode a state of uncertainty regarding sensor characteristics, and 
thereby to inquire into the value of training data and the behavior of an adap- 
tive system. 

The function of the components in Figure 1 becomes clear when we look at the process by which the system arrives at its final decision. This decision is simply a selection of one member of the set Θ for the given observed sensor response. The selection of the pattern θᵢ has consequences that depend on the actual state of the world θⱼ. We assume that it is possible to attach a cost or utility to the result of selecting θᵢ when, in fact, the true state is θⱼ; c_ij represents this penalty. We further assume that the best selection of a pattern is the one for which the expected cost or utility of the decision θᵢ,

$$\sum_j \{\theta_j \mid x, \mathscr{E}\}\, c_{ij}, \qquad (2)$$

is minimum.

The symbol {θⱼ | x, ℰ} in (2) represents the probability assigned to the event "pattern is θⱼ" when the knowledge at hand is the most recent sensor response x and everything that went before is ℰ.

If the decision had to be made before the sensor response became known, the probability {θⱼ | ℰ} would replace {θⱼ | x, ℰ} in (2). This prior probability represents the state of knowledge before the sensor response became available and is based on everything known about the pattern-generation process and on the measurements made up to now; for example, a pattern-recognition device being presented letters in sequence from English text might have current probability assignments profoundly influenced by the recent history of letters observed. In this article we do not consider the interesting problems associated with the development of models of the pattern-generation process and the assignment of prior probabilities. We avoid this problem by assuming that the box labeled "Modeling of pattern generation process" in Figure 1 performs the heroic service of specifying {θⱼ | ℰ}.

Returning now to the problem of making a decision after the sensor response x is known, note that the definition of conditional probability requires that

$$\{\theta_j \mid x, \mathscr{E}\} = \frac{\{\theta_j \mid \mathscr{E}\}\{x \mid \theta_j, \mathscr{E}\}}{\sum_i \{\theta_i \mid \mathscr{E}\}\{x \mid \theta_i, \mathscr{E}\}}. \qquad (3)$$

The decision problem is solved if {x | θⱼ, ℰ} can be found. This is the probability of the event "sensor response is x" conditioned on the presence of pattern θⱼ. If the sensor characteristics were known exactly, all the symbols used in (2) would be known directly, but when a pattern-recognition device





is first used, and perhaps long thereafter, the sensor characteristics are 
uncertain; so we turn now to the question of how knowledge about sensor 
characteristics is represented. 
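In code, the decision rule of (2) and (3) is only a few lines. The sketch below is illustrative (the names are ours, not the paper's): it forms the posterior over patterns for an observed response and returns the index of the minimum-expected-cost pattern.

```python
def decide(priors, likelihoods, costs, x):
    # priors[j] = {theta_j | E}; likelihoods[j][x] = {x | theta_j, E};
    # costs[i][j] = c_ij, the penalty for selecting theta_i when theta_j holds
    joint = [p * lik[x] for p, lik in zip(priors, likelihoods)]
    total = sum(joint)
    post = [v / total for v in joint]                            # eq. (3)
    exp_cost = [sum(c * q for c, q in zip(row, post)) for row in costs]
    return min(range(len(priors)), key=lambda i: exp_cost[i])    # eq. (2)
```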

The sensor characteristic for a pattern θⱼ is one point in the region of η-dimensional space defined by

$$\sum_i p_i = 1. \qquad (4)$$

Any point in this region may be regarded as a model of the sensor characteristic for pattern θⱼ. We have already assumed that one such model describes the sensor performance accurately; the other models are imperfect representations of the actual sensor behavior. The region defined by (4) forms the space of possible models of the sensor characteristic for the pattern θⱼ. There will be a model space for each of the patterns in Θ.

We can represent our state of knowledge of the sensor characteristic for pattern θⱼ by a density function over the model space,

$$\{p^{(j)} \mid \mathscr{E}\}. \qquad (5)$$

An impulse density at one point corresponds to certainty regarding the sensor characteristic, whereas a uniform density over the model space represents great vagueness regarding the sensor characteristic.

Using the concept of state of knowledge as a density function over model space, we may find the important term in (3) from†

$$\{x \mid \theta_j, \mathscr{E}\} = \mathop{\mathcal{S}}_{p^{(j)}} \{p^{(j)} \mid \theta_j, \mathscr{E}\}\{x \mid p^{(j)}, \theta_j, \mathscr{E}\} = \mathop{\mathcal{S}}_{p^{(j)}} p_x^{(j)} \{p^{(j)} \mid \mathscr{E}\} = E(p_x^{(j)}) = \bar{p}_x^{(j)}, \qquad (6)$$

since {x | p⁽ʲ⁾, θⱼ, ℰ} is by definition just p_x⁽ʲ⁾ and knowledge of p⁽ʲ⁾ is not changed by knowledge of θⱼ.

The role of the component “Modeling of sensor ” is thus to determine the 
value of £(pW) at each decision point. With no external data concerning 
the sensor's response to particular patterns, the value of E( pW) remains constant, 
but, if external training data are introduced (as in Figure 1), the knowledge of 
pW will improve. A system with this capability is adaptive; it is its properties 
that the remainder of this article explores. 

It should be noted that the adaptive property wc discuss involves learning 
about the sensor characteristic. It is also possible to improve our knowledge 
of the characteristics of the pattern-generation process by learning [2]. In a 
realistic situation, however, such as the occurrence of letters in English text, 
the pattern -generation process will not be accurately modeled by history- 
independent probabilities. For this reason it is extremely difficult to identify 

† The generalized summation operator 𝒮 is used to represent integration or summation (or both), depending on the nature of the operand.





the space of possible models for the pattern-generation process, to encode a 
state of information, and to develop a practical procedure for the use of training 
data. The work that follows assumes that knowledge of the pattern-generation 
process remains fixed. 

The properties of adaptive pattern recognition, as they are defined above, are of fundamental concern in the design of any recognition device. Sensor design requires selecting the measurements to be made and defining the distinguishable responses. An unlimited number of sensor designs are possible, and some are much better than others in distinguishing between patterns. The ease and speed of the design process depend on the ability of experimentation to determine the effectiveness of a particular design. An important question therefore is the following:

1. What amount of training is required to achieve within a specified tolerance the performance potential of a system whose sensor characteristics are initially unknown?

A different problem arises when only a limited body of training data is available; for example, if we seek to identify missile installations in a foreign country from aerial photographs, there may be only a small file of known examples on hand. An important question in this case concerns the optimum quantization of the sensor response. High quantization leads to the best performance if unlimited training is allowed, but at the expense of a lower learning rate. Thus, when a limited opportunity for training is considered, a grosser structuring of the sensor response may be preferable. This leads to the second important question:

2. What is the optimum sensor response quantization when there is a limited quantity of training data available?


2 . THE VALUE OF TRAINING DATA 

We introduce now the concept of the expected value of a set of training data, which will be useful in answering the two questions posed above. We begin by deriving a general expression for the expected value of a proposed training program. This step will help to clarify the concepts involved but yields a result that is too complex to be directly useful. Therefore, as the mathematical expressions become too cumbersome, we make the following assumptions in succession:

1. There are only two patterns, θ₁ and θ₂.

2. The prior distribution over the model space is uniform. 

3. The number of training data points is the same for both patterns. 

4. Before the introduction of training data the expected decision cost is 
independent of the pattern selected. 

We restrict attention here to training programs that present a sequence 
of input patterns and observe the sensor responses produced. A more general 





form of training program might be considered, but only at the expense of further complicating the expressions that follow. Because of the nature of the sensor and pattern-generation models assumed, the data can be represented by

$$D = (n_{11}, n_{12}, \ldots, n_{1\eta}, n_{21}, n_{22}, \ldots, n_{2\eta}),$$

where n_ij is the number of times that pattern θᵢ produced the sensor response xⱼ. The total number of θᵢ patterns in the training program is

$$N_i = \sum_j n_{ij}.$$

Let D represent a set of data from such a training program. In the event that D occurs, the probability {θⱼ | x, D, ℰ} of the pattern θⱼ, given a sensor response x, may be written as the generalized sum over the model space:

$$\{\theta_j \mid x, D, \mathscr{E}\} = \mathop{\mathcal{S}}_{p^{(j)}} \{\theta_j, p^{(j)} \mid x, D, \mathscr{E}\} = \mathop{\mathcal{S}}_{p^{(j)}} \frac{\{\theta_j \mid D, \mathscr{E}\}\{p^{(j)} \mid \theta_j, D, \mathscr{E}\}\{x \mid p^{(j)}, \theta_j, D, \mathscr{E}\}}{\{x \mid D, \mathscr{E}\}}, \qquad (7)$$

where we have used Bayes's rule in proceeding from the first to the second line in (7).

We shall consider each of the three expressions in the numerator of (7). The probability {θⱼ | D, ℰ} is the probability of occurrence for the jth pattern based on knowledge of the training data D. Since the training data provide no additional information on the occurrence of the patterns,

$$\{\theta_j \mid D, \mathscr{E}\} = \{\theta_j \mid \mathscr{E}\}. \qquad (8)$$


The second probability in the numerator of (7), {p⁽ʲ⁾ | θⱼ, D, ℰ}, is the posterior distribution of the sensor model for the jth pattern. The third probability in the numerator, {x | p⁽ʲ⁾, θⱼ, D, ℰ}, is the probability of an output measurement x if the pattern θⱼ and measurement model are known; by definition it is simply p_x⁽ʲ⁾. Thus (7) becomes

$$\{\theta_j \mid x, D, \mathscr{E}\} = \frac{\{\theta_j \mid \mathscr{E}\}\, \hat{p}_x^{(j)}}{\{x \mid D, \mathscr{E}\}}, \qquad (9)$$

where p̂ₓ⁽ʲ⁾ is the posterior mean of the xth component of the vector p⁽ʲ⁾:

$$\hat{p}_x^{(j)} = \mathop{\mathcal{S}}_{p^{(j)}} p_x^{(j)} \{p^{(j)} \mid \theta_j, D, \mathscr{E}\}. \qquad (10)$$

The results (6) and (10) are the same, except that (10) includes knowledge of the data D. The posterior probabilities of the patterns in (9) can now be used to make the decision that will have the minimum expected cost. The expected cost of deciding on pattern θᵢ when the measurement x has been observed is

$$C_i(x, D) = \sum_j c_{ij} \{\theta_j \mid x, D, \mathscr{E}\} \qquad (11)$$





and the expected cost of the optimum decision is

$$C(x, D) = \min_i \sum_j c_{ij} \{\theta_j \mid x, D, \mathscr{E}\} = \frac{1}{\{x \mid D, \mathscr{E}\}} \min_i \sum_j c_{ij} \{\theta_j \mid \mathscr{E}\}\, \hat{p}_x^{(j)}, \qquad (12)$$

where we have used the expression for {θⱼ | x, D, ℰ} in (9).

The expected value of C(x, D) with respect to the measurement x is

$$C(D) = \mathop{\mathcal{S}}_x \{x \mid D, \mathscr{E}\}\, C(x, D) = \mathop{\mathcal{S}}_x \{x \mid D, \mathscr{E}\} \min_i \sum_j c_{ij} \{\theta_j \mid x, D, \mathscr{E}\} = \mathop{\mathcal{S}}_x \min_i \sum_j c_{ij} \{\theta_j \mid \mathscr{E}\}\, \hat{p}_x^{(j)}, \qquad (13)$$

where we have used (12) to derive the last line of (13). This expression represents the expected cost of the next decision after a specific set of training data D has been observed but before observing the next sensor output. It is worth noting that the effect of the training data D on the decision cost C(D) is completely localized in the change of the posterior expected value of the sensor model p̂ₓ⁽ʲ⁾.

Now the value of the data D is defined as the decrease in the expected decision cost:

$$V(D) = C_0 - C(D), \qquad (14)$$

where C₀ is the expected decision cost when no training data are to be used, that is, when we use the prior expected value of p_x⁽ʲ⁾ in (13).

Finally, we can write the expected value of a proposed training program by computing the expected value of V(D) over all possible data sets:

$$\mathop{\mathcal{S}}_D \{D \mid \mathscr{E}\}\, V(D) = C_0 - \mathop{\mathcal{S}}_D \{D \mid \mathscr{E}\}\, C(D) = C_0 - \mathop{\mathcal{S}}_D \{D \mid \mathscr{E}\} \mathop{\mathcal{S}}_x \min_i \sum_j c_{ij} \{\theta_j \mid \mathscr{E}\}\, \hat{p}_x^{(j)}. \qquad (15)$$

In order to circumvent the computational problems inherent in (15), it is convenient now to limit further discussion to the two-pattern case. We may also assume, with no loss in generality, that c₁₁ = c₂₂ = 0, that is, that the cost of a correct classification is zero. With the aid of this assumption, the expression for C(x, D) in (12) becomes

$$C(x, D) = \frac{1}{\{x \mid D, \mathscr{E}\}} \min\left[c_{12}\{\theta_2 \mid \mathscr{E}\}\, \hat{p}_x^{(2)},\; c_{21}\{\theta_1 \mid \mathscr{E}\}\, \hat{p}_x^{(1)}\right].$$

A more conventional form is

$$C(x, D) = \frac{c_{21}\{\theta_1 \mid \mathscr{E}\}}{\{x \mid D, \mathscr{E}\}} \begin{cases} \gamma\, \hat{p}_x^{(2)} & \text{if } r(x) > \gamma, \\ \hat{p}_x^{(1)} & \text{if } r(x) < \gamma, \end{cases} \qquad (16)$$

where

$$r(x) = \frac{\hat{p}_x^{(1)}}{\hat{p}_x^{(2)}} \quad \text{and} \quad \gamma = \frac{c_{12}\{\theta_2 \mid \mathscr{E}\}}{c_{21}\{\theta_1 \mid \mathscr{E}\}}, \qquad (17)$$





and we have assumed that c₂₁ > 0. The top expression in (16) corresponds to deciding on pattern θ₁; the bottom one, to deciding on θ₂. This is the classical likelihood ratio test, but one in which the value of the likelihood ratio r(x) depends on the training data D [see (10)]. We can then use (13) and (16) to write the expected cost C(D) of the next decision:

$$C(D) = \sum_{\substack{\text{all } x \ni \\ r(x) > \gamma}} c_{12}\{\theta_2 \mid \mathscr{E}\}\, \hat{p}_x^{(2)} + \sum_{\substack{\text{all } x \ni \\ r(x) < \gamma}} c_{21}\{\theta_1 \mid \mathscr{E}\}\, \hat{p}_x^{(1)} = c_{21}\{\theta_1 \mid \mathscr{E}\}\left[\gamma \sum_{\substack{\text{all } x \ni \\ r(x) > \gamma}} \hat{p}_x^{(2)} + \sum_{\substack{\text{all } x \ni \\ r(x) < \gamma}} \hat{p}_x^{(1)}\right]. \qquad (18)$$


To proceed beyond (18) requires a specific prior state of knowledge regarding sensor characteristics. We assume now that for each pattern the prior distribution over the sensor model space is uniform; that is,

$$\{p^{(1)} \mid \mathscr{E}\} = \{p^{(2)} \mid \mathscr{E}\} = (\eta - 1)!. \qquad (19)$$

With the assumption in (19) the prior expected values of the p̄ₓ's are all equal to η⁻¹, and the expected cost of a decision in the absence of any training data can be calculated from (18) as

$$C_0 = c_{21}\{\theta_1 \mid \mathscr{E}\} \min(1, \gamma). \qquad (20)$$


The expression in (18) for the expected cost of a decision C(D) requires the posterior expected value of p_x⁽ⁱ⁾. In Appendix I we prove that if the prior distribution over p⁽ⁱ⁾ is uniform and there are Nᵢ training measurements from pattern θᵢ, n_ix of which are equal to x, then

$$\hat{p}_x^{(i)} = \frac{n_{ix} + 1}{N_i + \eta}. \qquad (21)$$
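Equation (21) is the familiar "add-one" rule: a uniform prior over the model simplex plus multinomial counts gives a posterior mean that behaves as if one phantom observation of every response had been seen. A one-function sketch (the helper name is ours):

```python
def posterior_mean(counts, eta):
    # eq. (21): counts[x] = number of training responses equal to x;
    # a uniform prior adds one phantom count to each of the eta outputs
    assert len(counts) == eta
    N = sum(counts)
    return [(n + 1) / (N + eta) for n in counts]
```

Note that the components always sum to one, so the posterior mean is itself a valid sensor model.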


Thus from (17) and (18) we can write

$$C(D) = c_{21}\{\theta_1 \mid \mathscr{E}\}\left[\gamma \sum_x{}' \frac{n_{2x} + 1}{N_2 + \eta} + \sum_x{}'' \frac{n_{1x} + 1}{N_1 + \eta}\right], \qquad (22)$$

where the first summation is over all x such that

$$\frac{n_{1x} + 1}{n_{2x} + 1} > \gamma\, \frac{N_1 + \eta}{N_2 + \eta} \qquad (23)$$

and the second summation is over all x such that

$$\frac{n_{1x} + 1}{n_{2x} + 1} < \gamma\, \frac{N_1 + \eta}{N_2 + \eta}. \qquad (24)$$
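Equations (22) through (24) translate directly into code. The sketch below is our own illustration (function and argument names are not from the paper); the last argument stands for the factor c₂₁{θ₁ | ℰ}. When (23) holds with equality the two decisions cost the same, so assigning ties to either region does not change the total.

```python
def expected_next_cost(n1, n2, eta, gamma, c21_prior1):
    # eq. (22): expected cost of the next decision given training counts
    # n1, n2 (responses observed under theta-1 and theta-2 respectively)
    N1, N2 = sum(n1), sum(n2)
    s1 = s2 = 0.0
    for x in range(eta):
        # decision regions (23) and (24), in cross-multiplied form
        if (n1[x] + 1) * (N2 + eta) > gamma * (n2[x] + 1) * (N1 + eta):
            s1 += (n2[x] + 1) / (N2 + eta)      # decide theta-1, pay gamma * p2
        else:
            s2 += (n1[x] + 1) / (N1 + eta)      # decide theta-2, pay p1
    return c21_prior1 * (gamma * s1 + s2)
```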


The value of a set of training data, V(D), is the difference between C₀ and C(D) in (20) and (22). Now let V̄(N₁, N₂) be the expected value of a training program consisting of N₁ measurements of θ₁ and N₂ measurements of θ₂. To evaluate this quantity from (15) we need the probability {D | ℰ} of observing a set of data during the training program. In Appendix I we prove that for N₁ data points from θ₁ and N₂ from θ₂ the probability of a set of data

$$D = (n_{11}, n_{12}, \ldots, n_{1\eta}, n_{21}, \ldots, n_{2\eta})$$

is uniform over the space of all possible outcomes; that is,

$$\{D \mid \mathscr{E}\} = \left[\binom{N_1 + \eta - 1}{N_1}\binom{N_2 + \eta - 1}{N_2}\right]^{-1}. \qquad (25)$$

Thus the expected value of these data is

$$\bar{V}(N_1, N_2) = c_{21}\{\theta_1 \mid \mathscr{E}\}\left[\min(1, \gamma) - \binom{N_1 + \eta - 1}{N_1}^{-1}\binom{N_2 + \eta - 1}{N_2}^{-1} \sum_D \left(\gamma \sum_x{}' \frac{n_{2x} + 1}{N_2 + \eta} + \sum_x{}'' \frac{n_{1x} + 1}{N_1 + \eta}\right)\right], \qquad (26)$$

where the primed summations extend over the ranges of x defined by (23) and (24), respectively.
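The expectation in (26) can be checked by brute force for small N and η: enumerate every data set (each equally likely, by (25)), form the smoothed posterior means of (21), and average the resulting decision costs. The sketch below is ours, not the paper's; it uses exact rational arithmetic and returns the expected training value in units of c₂₁{θ₁ | ℰ}, for equal sample sizes N₁ = N₂ = N.

```python
from fractions import Fraction

def compositions(n, k):
    # every way of distributing n training responses over k sensor outputs
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def training_value(N, eta, gamma=Fraction(1)):
    # brute-force expected value of N training samples per pattern:
    # average C0 - C(D) over all equally likely data sets, eq. (25)
    outcomes = list(compositions(N, eta))
    cost = Fraction(0)
    for n1 in outcomes:
        for n2 in outcomes:
            for x in range(eta):
                p1 = Fraction(n1[x] + 1, N + eta)    # eq. (21)
                p2 = Fraction(n2[x] + 1, N + eta)
                cost += min(gamma * p2, p1)          # decision cost, eq. (16)
    C_D = cost / len(outcomes) ** 2
    return min(Fraction(1), gamma) - C_D             # C0 term from eq. (20)
```

For N = 1, η = 2, γ = 1 this gives exactly 1/6, in agreement with (34).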

The expression in (26) is quite complex, but it can be simplified considerably by introducing the final two assumptions: the number of training data is the same for both patterns,

$$N_1 = N_2 = N, \qquad (27)$$

and

$$\gamma = \frac{c_{12}\{\theta_2 \mid \mathscr{E}\}}{c_{21}\{\theta_1 \mid \mathscr{E}\}} = 1. \qquad (28)$$

Reference to the expressions in (16) and (17) shows that (28) is equivalent to the assumption of indifference between the two decision alternatives before receiving any training data.

The expression in (20) becomes

$$C_0 = c_{21}\{\theta_1 \mid \mathscr{E}\}, \qquad (29)$$

and so

$$\bar{V}(N, N) = C_0\left[1 - \binom{N + \eta - 1}{N}^{-2} \sum_{\substack{\text{all} \\ (n_{11}, \ldots, n_{1\eta})}} \sum_{\substack{\text{all} \\ (n_{21}, \ldots, n_{2\eta})}} \frac{\eta + \sum_x \min(n_{1x}, n_{2x})}{N + \eta}\right]. \qquad (30)$$


In Appendix II we derive a simpler expression for V̄(N, N):

$$\bar{V}(N, N) = C_0 (\eta - 1)\, \frac{(N!)^2}{(N + \eta)!\,(N + \eta - 1)!} \sum_{m=1}^{N} \frac{m\,(m + \eta - 1)!\,(m + \eta - 2)!}{(m!)^2}, \qquad (31)$$






and this can be reduced to a recursion relation:

$$\bar{V}(N, N) = \frac{N^2}{(N + \eta)(N + \eta - 1)}\, \bar{V}(N - 1, N - 1) + \frac{C_0\, N(\eta - 1)}{(N + \eta)(N + \eta - 1)} \qquad \text{for } N \geq 1. \qquad (32)$$

Of course, the expected value of no data is zero: V̄(0, 0) = 0. We also prove in Appendix II that for large amounts of training data

$$\bar{V}(N, N) \to C_0\, \frac{\eta - 1}{2\eta - 1} \qquad \text{as } N \to \infty. \qquad (33)$$

The recursion relation in (32) was used to calculate V̄(N, N) for values of η from 2 to 50. Figure 2 portrays some of these results graphically; specifically, we have plotted V̄(N, N) for η = 2, 3, 10, and 50. The dashed curve is discussed in the next section.

The asymptotic behavior of V̄(N, N) in (33) is especially interesting; it implies that for a uniform prior over the model space a large amount of training is expected to produce an average improvement of 33 to 50 percent in the operating costs of the pattern-recognition system. A look at this variation in asymptotic improvement as a function of the measurement quantization η shows an expected eventual improvement of 33 percent for η = 2, whereas a very large value of η increases this expected improvement to only 50 percent. This result has the intuitively satisfying interpretation that the expected performance of the pattern-recognition system improves as the complexity of the sensor increases. However, as the curves in Figure 2 show, more training data are required to achieve this expected improvement in performance as the sensor complexity increases. In fact, for values of η up to 50, approximately 2η data points are required to produce 80 percent of the improvement expected from infinite training.
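The recursion in (32) is trivial to iterate. The sketch below is our own implementation (with C₀ set to 1, so the result is the fractional cost improvement); it reproduces the small-sample values and approaches the asymptote of (33).

```python
def v_bar(N, eta):
    # V-bar(N, N) in units of C0, built up from the recursion in (32)
    # starting from V-bar(0, 0) = 0
    v = 0.0
    for k in range(1, N + 1):
        d = (k + eta) * (k + eta - 1)
        v = (k * k / d) * v + k * (eta - 1) / d
    return v
```

For η = 2 the limit (η − 1)/(2η − 1) is 1/3, the 33 percent figure quoted above.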






3. OPTIMUM QUANTIZATION 

We next examine the question posed earlier: what expected improvement can we achieve with a limited amount of training data?

The expected value of a single training datum, from (31), is

$$\bar{V}(1, 1) = C_0\, \frac{\eta - 1}{\eta(\eta + 1)}. \qquad (34)$$

This quantity decreases as the quantization increases. Thus we have the interesting result that for a single training datum increasing the quantization can increase the expected cost of operating the pattern classifier.

The curves of V̄(N, N) in Figure 2 indicate that for each value of N there will be some value of η for which V̄(N, N), the expected value of the data, is maximum. In other words, for a fixed amount of training data there is a degree of quantization that will minimize the expected posterior cost of operating the system. This optimum value of η is plotted in Figure 3 for values of N from 1 to 1000. The corresponding maximum value of V̄(N, N),

$$V_{\max}(N, N) = \max_{\eta} \bar{V}(N, N), \qquad (35)$$

is the dashed curve in Figure 2. As an example of the use of Figures 2 and 3, an expected 40 percent reduction in operating costs can be achieved with only 22 training samples per pattern if the quantization η is set equal to 6. Increasing the quantization to 50 will decrease the expected value of the data by 42 percent.



Figure 3. Optimum quantization for a fixed training program. 
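A Figure-3-style search can be sketched by maximizing V̄(N, N) over η for each N, using the recursion in (32). The code below is illustrative (function names are ours, and C₀ is set to 1); it recovers the worked example of 22 samples per pattern quoted above.

```python
def v_bar(N, eta):
    # V-bar(N, N)/C0 from the recursion in (32), with V-bar(0, 0) = 0
    v = 0.0
    for k in range(1, N + 1):
        d = (k + eta) * (k + eta - 1)
        v = (k * k / d) * v + k * (eta - 1) / d
    return v

def optimum_eta(N, eta_max=50):
    # the quantization that maximizes the expected value of N training
    # samples per pattern, as plotted in Figure 3
    return max(range(2, eta_max + 1), key=lambda eta: v_bar(N, eta))
```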






4. SUMMARY AND CONCLUSIONS 

This article considers some of the fundamental adaptive properties of a pattern-recognition system. The assumed objective of the pattern selection is to minimize the expected cost of the classification decision. This statement of the objective requires that the probability of each pattern, conditional on the information at that time, be known. We suppose that this required probability is constructed from: (a) a prior probability of pattern presence; (b) the response of a sensing device; (c) current knowledge of the response characteristics of the sensor. We further assume that the current state of knowledge of the sensor characteristics is uncertain. By limiting the sensor to a finite number η of possible responses and by modeling the sensor response as a multinomial trial, it is possible to define the space of all sensor models as a simple region in η-dimensional space. The state of knowledge of sensor characteristics can be encoded as a density function over the model space.

Adaptive pattern recognition can be viewed as the process of using training data to improve the state of knowledge regarding the sensor performance characteristics and the pattern-generation process. We have concentrated in this paper on the adaptation achieved through inference of sensor characteristics.

The expected value of training data was derived under the condition that there are only two patterns and the initial knowledge of sensor characteristics is represented by a uniform density function over the model space. This result was used to show that (a) for values of η less than 50, approximately 2η training points are expected to develop 80 percent of the performance potential of the system; (b) if only a limited body of training data is available, there is some finite value of η for which the expected cost of operating the trained system is minimum; increasing η beyond this value will increase this expected cost. A graph of the optimum value of η for specific quantities of training data is plotted in Figure 3.

These results are based on the assumption of a uniform prior over the model space and a set of costs and prior pattern probabilities such that each choice has the same initial expected cost. This suggests that the calculated quantities of training data required to reach a specified level of performance are likely to be larger than those needed in practice.

An important aspect of this study is the departure from the usual continuous sensor models; by limiting the argument to a finite-valued sensor response we were able to discuss simultaneously the effects of sensor quantization and training data on the adaptive properties of the system.

5. REFERENCES 

[1] N. Abramson and D. Braverman, "Learning to Recognize Patterns in a Random Environment," Stanford Electronics Laboratories Report SEL-62-071, May 1962.

[2] D. J. Braverman, "Machine Learning and Automatic Pattern Recognition," Stanford Electronics Laboratories Report, Tech. Report No. 2003-1, February 1961.

[3] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I, Wiley, New York, 1950, p. 36.




[4] D. G. Keehn, “ Learning the Mean Vector and Covariance Matrix of Gaussian 
Signals in Pattern Recognition/’ Stanford Electronics Laboratories Report, SEL- 
62-155, February 1963. 

[5] T. Marill and D. M. Green, "Statistical Recognition Functions and the Design of Pattern Recognizers," IRE Trans. Electronic Computers, EC-9, No. 4, 472-477 (December 1960).

[6] T. Marill and D. M. Green, "On the Effectiveness of Receptors in Recognition Systems," IEEE Trans. Inform. Theory, IT-9, No. 1, 11-17 (January 1963).

[7] H. J. Scudder, “Probability of Error of Some Adaptive Pattern-Recognition 
Machines,” IEEE Trans. Inform. Theory , 363-371 (July 1965). 

[8] E. A. Silver, "Markovian Decision Processes with Uncertain Transition Probabilities or Rewards," Appendix B, Tech. Rept. No. 1, Contract Nonr-1841(87), Operations Research Center, M.I.T., August 1963.


APPENDIX I 

One of the main assumptions in Section 2 is that the prior distribution over the model space for each pattern is uniform; that is,

$$\{p^{(j)} \mid \mathscr{E}\} = (\eta - 1)!. \qquad (36)$$

In this appendix we calculate for this uniform prior the posterior expected value of the probability in (10):

$$\hat{p}_x^{(j)} = \mathop{\mathcal{S}}_{p^{(j)}} p_x^{(j)} \{p^{(j)} \mid D, \mathscr{E}\}, \qquad (37)$$

where the training data have the form

$$D = (n_{j1}, n_{j2}, \ldots, n_{j\eta}). \qquad (38)$$

Because of the assumed independence between the training data from different patterns, we can limit ourselves to the consideration of data for a single pattern with no loss in generality.

For a given set of data D the posterior distribution of the sensor model is

$$\{p^{(j)} \mid D, \mathscr{E}\} = \frac{\{p^{(j)} \mid \mathscr{E}\}\{D \mid p^{(j)}, \mathscr{E}\}}{\{D \mid \mathscr{E}\}}. \qquad (39)$$

The second term in the numerator of (39) is just the probability of the set of data when the model p⁽ʲ⁾ is known:

$$\{D \mid p^{(j)}, \mathscr{E}\} = \frac{N_j!}{n_{j1}!\, n_{j2}! \cdots n_{j\eta}!}\, (p_1^{(j)})^{n_{j1}} \cdots (p_\eta^{(j)})^{n_{j\eta}}, \qquad (40)$$

and so the posterior distribution of the sensor model is

$$\{p^{(j)} \mid D, \mathscr{E}\} = k \prod_{x=1}^{\eta} (p_x^{(j)})^{n_{jx}}, \qquad (41)$$





where k is a normalizing constant that can be found by forcing the joint density function in (41) to have unit volume†:

$$k = \frac{(N_j + \eta - 1)!}{n_{j1}!\, n_{j2}! \cdots n_{j\eta}!}. \qquad (42)$$

The expected value of p_x⁽ʲ⁾ can then be calculated:

$$\hat{p}_x^{(j)} = k \int_{p^{(j)}} p_x^{(j)} \prod_{i=1}^{\eta} (p_i^{(j)})^{n_{ji}}\, dp_1^{(j)} \cdots dp_\eta^{(j)} = k\, \frac{n_{j1}! \cdots (n_{jx} + 1)! \cdots n_{j\eta}!}{(N_j + \eta)!} = \frac{n_{jx} + 1}{N_j + \eta}. \qquad (43)$$

We now calculate the probability {D | ℰ} of a particular set of data D = (n_{j1}, n_{j2}, …, n_{jη}) if the prior distribution over the model space is uniform. The application of elementary probability operations yields

$$\{D \mid \mathscr{E}\} = \mathop{\mathcal{S}}_{p^{(j)}} \{D, p^{(j)} \mid \mathscr{E}\} = \mathop{\mathcal{S}}_{p^{(j)}} \{p^{(j)} \mid \mathscr{E}\}\{D \mid p^{(j)}, \mathscr{E}\}, \qquad (44)$$

and substituting the relations in (36) and (40) produces the result that the probability {D | ℰ} is uniform over all possible data sets of the form in (38):

$$\{D \mid \mathscr{E}\} = \frac{(\eta - 1)!\, N_j!}{n_{j1}!\, n_{j2}! \cdots n_{j\eta}!} \int_{p^{(j)}} \prod_{x=1}^{\eta} (p_x^{(j)})^{n_{jx}}\, dp_1^{(j)} \cdots dp_\eta^{(j)} = \frac{(\eta - 1)!\, N_j!}{(N_j + \eta - 1)!} = \binom{N_j + \eta - 1}{N_j}^{-1}, \qquad (45)$$

and the expression for k in (42) was used in the derivation of (45). The quantity $\binom{N_j + \eta - 1}{N_j}$ represents all possible ways in which the Nⱼ data measurements can be distributed among the η possible sensor outputs.
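The cancellation in (45) can be verified with exact integer arithmetic. The sketch below (our own check, not from the paper) combines the uniform prior (η − 1)!, the multinomial term of (40), and the simplex integral of Πₓ pₓ^(nₓ), which equals Πₓ nₓ!/(N + η − 1)!; the factorials of the counts cancel, so every data set receives the same probability.

```python
from fractions import Fraction
from math import factorial

def prob_data(counts):
    # eq. (44)-(45): {D | E} for one pattern, given per-response counts
    N, eta = sum(counts), len(counts)
    multinomial = factorial(N)
    for n in counts:
        multinomial //= factorial(n)        # N! / (n_1! ... n_eta!)
    integral = Fraction(1)
    for n in counts:
        integral *= factorial(n)            # simplex integral numerator
    integral /= factorial(N + eta - 1)
    return factorial(eta - 1) * multinomial * integral
```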


APPENDIX II 

The expression for the expected value of a training program in (30) is quite
complex. This appendix presents the derivation of a simpler expression for
\bar{V}(N, N).

It is convenient to consider only the last expression in (30). Thus we define

\Lambda(N) = \sum_{(n_{1l})} \sum_{(n_{2l})} \sum_{l=1}^{\eta} \min(n_{1l}, n_{2l}). \qquad (46)

† For a more detailed discussion of this multivariate beta distribution the reader is
referred to [8].



106 


GEORGE R. MURRAY, JR., AND RICHARD D. SMALLWOOD

The first summation in (46) requires the consideration of all possible sets of 
training data that might result from the training program. Toward this end, 
Figure 4 illustrates the space of possible data sets for the first pattern: 



Figure 4. The space of possible data sets.


Each column in this array represents a possible data set and so will have the 
sum of its elements equal to N. 

Of course, there will also be an identical array for the space of all data sets
for pattern \theta_2; and the space of data sets for both patterns can be visualized as
the \binom{N + \eta - 1}{N}^2 possible combinations of the columns of the two individual
arrays taken two at a time. The evaluation of the double summation in (46) can
be accomplished by comparing each column of the array in Figure 4 with each
column in the array for pattern \theta_2 and writing down a third column that has
each of its elements equal to the minimum of the two elements under comparison.
The sum of the entries in all of the \binom{N + \eta - 1}{N}^2 third columns is \Lambda(N).

Since the arrays for the two patterns are identical, the double summation 
can also be evaluated by comparing each element of the array in Figure 4 with 
every member of its row and summing the minimum of all these comparisons. 
Furthermore, since each row of this array will contain the same collection of 
numbers, we need only carry out these comparisons for one row and then 
multiply the result by \eta.

It is convenient, then, to consider all the entries in any row that are equal
to l. Let k_l equal the number of such entries in each row. If we compare any
one of these entries with all the members of the row and sum the minimum of
these comparisons, we obtain a sum equal to

\sum_{m=0}^{l} m k_m + l \sum_{m=l+1}^{N} k_m, \qquad (47)








where the first summation corresponds to those comparisons with entries whose
values are less than or equal to l and the second summation with entries whose
values are greater than l. There are k_l such comparisons to be carried out for
each row and there are \eta rows; the result of summing over all the entries in
Figure 4 is

\Lambda(N) = \eta \sum_{l=0}^{N} k_l \sum_{m=0}^{l} m k_m + \eta \sum_{l=0}^{N} l\, k_l \sum_{m=l+1}^{N} k_m. \qquad (48)

Adding and subtracting the sum

\eta \sum_{l=0}^{N} k_l \sum_{m=l+1}^{N} m k_m

from the right side of (48) gives

\Lambda(N) = \eta \sum_{l=0}^{N} k_l \sum_{m=0}^{N} m k_m - \eta \sum_{l=0}^{N} k_l \sum_{m=l+1}^{N} (m - l) k_m. \qquad (49)


To proceed further it is necessary to know the value of k_l. In Appendix I
we found that the number of ways to distribute the N data measurements
among the \eta possible sensor outputs is \binom{N + \eta - 1}{N}. In the language of com-
binations this is the number of ways in which N indistinguishable balls can
be placed in \eta distinguishable urns (see [3]). If there are l of the balls in one
urn, then k_l is just the number of ways in which the remaining (N - l) balls
can be distributed among the (\eta - 1) remaining urns. Thus

k_l = \binom{N - l + \eta - 2}{\eta - 2} = \binom{N - l + \eta - 2}{N - l}. \qquad (50)


Since k_l is also the number of entries in each row of Figure 4 that are equal
to l, the first summation in (49) just represents the total number of entries in
each row of Figure 4; and so

\sum_{l=0}^{N} k_l = \binom{N + \eta - 1}{N}. \qquad (51)


The second summation in (49) represents the sum of all the entries in
each row, but each row in Figure 4 contains the same set of numbers; and
so the sum of the entries in each row will just equal the sum of all the entries
in the entire array divided by the number of rows \eta. Furthermore, the sum
of the entries in each of the \binom{N + \eta - 1}{N} columns is N; thus the second
summation in (49) can be written as

\sum_{m=0}^{N} m k_m = \frac{N}{\eta} \binom{N + \eta - 1}{N}. \qquad (52)
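Identities (50), (51), and (52) are easy to verify numerically; the following check is ours and runs the comparison over a range of small N and η:

```python
from fractions import Fraction
from math import comb

def check_row_identities(N, eta):
    # k_l of (50): ways to place the remaining N - l balls in the eta - 1 other urns
    k = [comb(N - l + eta - 2, eta - 2) for l in range(N + 1)]
    # (51): the k_l sum to the number of columns in Figure 4
    assert sum(k) == comb(N + eta - 1, N)
    # (52): the weighted row sum equals (N/eta) times the column count
    assert sum(m * k[m] for m in range(N + 1)) == Fraction(N, eta) * comb(N + eta - 1, N)

for N in range(1, 8):
    for eta in range(2, 6):
        check_row_identities(N, eta)
print("identities (50)-(52) confirmed")
```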




Finally, the last summation in (49) can, by a change of variable, be re-
written as

\sum_{m=l+1}^{N} (m - l) k_m = \sum_{i=1}^{N-l} i\, k_{i+l} = \sum_{i=1}^{N-l} i \binom{N - l - i + \eta - 2}{\eta - 2}. \qquad (53)


This final expression in (53) is identical to (52), with N replaced by (N - l);
hence

\sum_{m=l+1}^{N} (m - l) k_m = \frac{N - l}{\eta} \binom{N - l + \eta - 1}{N - l}. \qquad (54)


Substitution of the results of (50), (51), (52), and (54) into (49) yields

\Lambda(N) = N \binom{N + \eta - 1}{N}^2 - \sum_{l=0}^{N} (N - l) \binom{N - l + \eta - 2}{\eta - 2} \binom{N - l + \eta - 1}{\eta - 1}. \qquad (55)

This result can now be used in the original expression for \bar{V}(N, N) in (30):

\bar{V}(N, N) = \frac{C_0}{(N + \eta) \binom{N + \eta - 1}{N}^2} \sum_{l=0}^{N} (N - l) \binom{N - l + \eta - 2}{\eta - 2} \binom{N - l + \eta - 1}{\eta - 1}, \qquad (56)

which, after a change of variables, can be written as

\bar{V}(N, N) = \frac{C_0}{(N + \eta) \binom{N + \eta - 1}{N}^2} \sum_{m=0}^{N} m \binom{m + \eta - 2}{\eta - 2} \binom{m + \eta - 1}{\eta - 1}. \qquad (57)


Use of the identity

m \binom{m + \eta - 2}{\eta - 2} = (\eta - 1) \binom{m + \eta - 2}{\eta - 1} \qquad (58)

reduces (57) somewhat further to

\bar{V}(N, N) = \frac{C_0 (\eta - 1)}{(N + \eta) \binom{N + \eta - 1}{N}^2} \sum_{m=1}^{N} \binom{m + \eta - 2}{\eta - 1} \binom{m + \eta - 1}{\eta - 1}, \qquad (59)

which is the same as (31).







The expression in (59) can also be written as

\bar{V}(N, N) = \frac{N^2}{(N + \eta)(N + \eta - 1)}\, \bar{V}(N - 1, N - 1) + \frac{C_0 N (\eta - 1)}{(N + \eta)(N + \eta - 1)}, \qquad N \ge 1, \qquad (60)

and this is the recursion relation in (32).

We now prove that \bar{V}(N, N) converges to an asymptotic form for large N:

\bar{V}(N, N) \to \frac{C_0 (\eta - 1)}{2\eta - 1} \qquad \text{as } N \to \infty. \qquad (61)


The plan of attack is to prove that the difference between \bar{V}(N, N) and the right
side of (61) approaches zero for large N. Therefore we define the difference

\delta(N) = \frac{C_0 (\eta - 1)}{2\eta - 1} - \bar{V}(N, N). \qquad (62)


Substitution of this definition into (60) produces the recursion relation

\delta(N) = \frac{N^2\, \delta(N - 1)}{(N + \eta)(N + \eta - 1)} + \frac{\eta w}{(N + \eta)(N + \eta - 1)} \quad \text{for } N \ge 1, \qquad (63)

where

w = \frac{C_0 (\eta - 1)^2}{2\eta - 1}. \qquad (64)


The method for proving the convergence of \bar{V}(N, N) in (61) is to prove
that

0 < \delta(N) < \frac{w}{N + 1}. \qquad (65)

This will establish that \delta(N) \to 0 as N \to \infty. We proceed by induction: for
N = 1 and \eta \ge 2 the relations in (62) and (63) yield

\delta(1) = \frac{(\eta^2 - \eta + 1)\, w}{\eta (\eta - 1)(\eta + 1)}. \qquad (66)

This quantity is greater than zero, as are w and the coefficients of \delta(N - 1)
and w in (63). Thus it is clear from (63) and (66) that \delta(N) will be greater than
zero for all N; and so the left side of (65) is established.

† The case \eta = 1 is uninteresting, for \bar{V}(N, N) is identically zero for all N; that is,
the sensor is useless if its output is always the same.





For \eta \ge 2, (66) yields the inequality

\delta(1) = \left[1 + \frac{2 - \eta}{(\eta + 1)(\eta - 1)}\right] \frac{w}{\eta} \le \frac{w}{2}, \qquad (67)


and now we prove that if the right side of (65) is true for \delta(N - 1) it is also
true for \delta(N). Assuming that \delta(N - 1) < w/N, we have

\delta(N) < \frac{N w}{(N + \eta)(N + \eta - 1)} + \frac{\eta w}{(N + \eta)(N + \eta - 1)} = \frac{w}{N + \eta - 1} \le \frac{w}{N + 1}. \qquad (68)

Thus the inequality in (65) is proved; and this proves the convergence of
\bar{V}(N, N) to the asymptotic form of (61).

It is worth noting in passing that the inequality in (65) can also be used to
find a bound on \bar{V}(N, N); in fact, for \eta = 2 the right side of (65) is equal to
C_0 / [3(N + 1)].

DEDUCTION A PARTIR D’UN MODELE DANS LA 
RECONNAISSANCE ADAPTIVE DES FORMES 


RESUME

In this article several fundamental properties of an adaptive pattern-recognition
system are deduced from the model of the sensor characteristics. A specific
type of system is considered in which (1) a finite number of patterns may occur;
(2) each sensor responds to certain physical attributes of the pattern; and (3) a
decision organ chooses the pattern that minimizes an average cost. This decision
is made by considering (4) the pattern-generation process and (5) the current
characteristics of the sensor.

By limiting the possible sensor responses to a finite number and by assuming
a multinomial model for the measurement process, it is possible to define a
complete space of models, to encode the uncertainty concerning the sensor
characteristics, and to examine the effects of a given training program. Attention
is directed to two major problems: (1) What quantity of training data is necessary
for the system to attain, within fixed tolerances, its potential performance?
(2) What is the optimal quantization of the sensor response when only a limited
quantity of training data is available?

It is shown in a particular case that, after about 2\eta training trials, the expected
value of the system performance index is 80% of its potential value. An optimal
value of \eta is determined as a function of the quantity of training data available.





A THEORY OF THE CONTROL 
OF THE COGNITIVE PROCESSES 

Une Théorie sur le Contrôle
des Moyens de Perception

Nicholas M. Smith 

Research Analysis Corporation, McLean, Virginia
United States of America 

1. INTRODUCTION 

The objectives of this paper are to assemble the directives for decision (which
we call "cognitive controls"), to determine the characteristics of rational response
imposed by these constraining directives, and to illuminate those factors that
delimit the rational act and that constitute the edge of reason.

As a primitive commitment, this work accepts undecidability as an essential 
feature of the rational process, essential in the sense that open-ended, in- 
definitely extended rationality can never achieve unconditional decidability [1]. 
This does not preclude the possibility of establishing decidability in a closed
finite rational system. The establishment of such relative decidability by attain-
ment of rational concepts and models for the world of experience so-far-forth
is not ruled out by the major commitment. To a philosophy so committed we
give the general name "conceptual relativism" and to the kind of decidability
that can be established in finite systems, the term "relative decidability."

Although modern science has embraced [2] this doctrine in conceptualization 
and prediction with respect to substantive objects, its extension into the act of 
valuation is at odds with the historical trend in Western culture for the last 25 
centuries. We have constructed a schism between the world of objects and the 
world of values; we have permitted the scientist to accept relativism in objective 
theories, whereas the ethical philosopher and the theologian have labored under 
illegitimate demands for absolute knowledge of values. Although relativism in 
value concepts as well as in substantive concepts seems repugnant to many, this 
condition is a necessary concomitant of cognitive freedom. A world in which 
all cognitive acts are reduced to principles that are immutable and complete is
reduced to mere algorithm, that is, a set of directives leading deterministically 
to decision. It is not knowledge of (absolute) truth but undecidability that is 
the mark of man's freedom. 

1.1. Freedom and Control; Hierarchical Structure 

Undecidability at any level of cognition indicates existence of freedom at 
that level. As a scientist may say of a particular problem, a degree of freedom 
exists. He may then elucidate a more general principle which absorbs the 
freedom and decides the point in question. 





For example, a physicist in developing a model of the motion of a particle 
moving in a gravitational field may first formalize a space that has three independ- 
ent degrees of freedom. He then may write the deterministic equation of motion 
or, in differential form, the “laws” of motion; or he may state Hamilton’s 
principle of least action. In contrast with deterministic formulations, the 
application of a normative principle (e.g., least action) forecloses freedom by 
means of a unique selective process [3]. In an analogous manner the cognitive 
controls, as general principles, operate at the meta- level to select admissible 
object theories, models, and constructs. 

The points to be made here are (a) that the application of principle beyond 
principle sets up a hierarchical systemic structure, each layer being composed 
of more abstract and general principles than those occurring in lower echelons 
of the structure, and (b) that there is a type of freedom at each level that is 
consumed (i.e., a selection made from within the alternatives permitted by the 
range and degrees of freedom) by a principle acting as an operator for decision 
at the next immediately higher region. This characteristic sets up the essential 
systemic nature of all conceptualization. Hence "system" is not a word that
applies to some objects or concepts and not to others: it is a cognitive para-
digm applicable literally to all things [2].

We anticipate the conclusions of this study by mentioning here that this 
systemic structure is composed of a finite number of layers (levels of partition) 
having an atomic, or lowest, level and a supreme level and that each layer is 
partitioned into a finite number of parts. 

1.2. Freedom and Control: Operand and Operator 

Since the decidability operators (principles) at each level are more general
than those that operate at lower levels, they are also fewer in number. The 
following additional observations can be made: 

1. This stratification of systemic hierarchical levels determines a natural 
taxonomy of systems [3] providing broad categories, that is, objects, meta- 
principles, and supreme principles. This situation in cognition is analogous 
to the structure of courts of law. There are the trial courts, which are concerned
with the objective details of the case; the appeals courts, which are concerned
with legal jurisdiction and procedure of the trial courts; and a supreme court,
which defines the law according to a set of agreed-upon acts and a priori con-
stitutional principles admitted in advance of trial court action.

2. The taxonomic structure is limited to a finite hierarchical range by one 
of the meta-principles (testability). 

3. The observation that principles at higher levels are relatively more general
(i.e., pertain to a wider and more inclusive class of constructs) leads to the
conclusion that the number of principles decreases as the level of consideration
is raised. The number at the supreme level is a basic minimum. If it were not
for this triangular structure, the process of abstraction and concrescence (grow-
ing together into an enlarged whole; synthesis [4]) would not introduce any
simplification in the cognitive process, and rationality would be reduced to
collections of ad hoc rules of thumb.




4. The principles that accomplish decidability at the meta-level are extra-
logical. Thus the approach of this article will be extra-formal.

1.3. The Finite Cognitive Agent

Cognition is a process that is carried out in a "command and control"
organism (including mechanisms, programmed computers, cybernetic devices,
individual members of a biological species, organizations, and cultures). This
organism is referred to as the "subject" or as the "cognitive agent." Because
of the hierarchical structure of conceptualization, the referent of these terms
must be relative to the specifics of the problematic situation: what organism,
what cognitive model, what level in that cognitive model? When such terms
are unqualified, as when the context is clear, the relative conditions are not
specified; however, they are always present and modify or constrain the discus-
sion. The specifics of the following remarks refer to the whole organism and to
its entire set of cognitive models, sensors, and motors, facing an environment
of other organisms, and so on.

We are committed to the following general restraint: every cognitive agent
is finite; it has finite capacity for information storage and retrieval; it has finite
capacity for semiotic processing; it carries out its function of processing and
intercommunication at finite rates; it communicates with its environment at
finite rates of information transmitted and in terms of finite strings of symbols.
These constraints are referred to as the "cybernetic limitations" of the cognitive
agent. The cybernetic limitations are characteristics of the organism; that is,
they are species-specific.

A concomitant of this commitment is the essential discrete (noncontinuous) 
character of the semiotic states representable in the cognitive agent. (We may
note that even an analog computer is essentially discrete, since its dial settings 
and meter readings can be determined with only finite resolution.) The char- 
acteristic of discreteness of all substantive things will emerge from one of the 
cognitive controls (testability) determined by the finiteness of the cognitive 
agent. 

The consequence of this characteristic is severe: the semiotic states in the
cognitive agent, being finite (and discrete), limit cognitive models similarly, that is,
to models that are finite and discrete.

Formal theories utilizing concepts of continuity and/or of infinities are all
finite theories. The logician or mathematician introduces such concepts in
terms of indefinitely extended operators, which are designated by discrete
symbols. He bears the burden of demonstrating that the (finite) operations that
he executes with these systems are legitimate (i.e., they are admissible symbolic
operations in a finite cybernetic device).


2 . PRIMITIVE COMMITMENTS OF CONCEPTUAL RELATIVISM 

Conceptual relativism, as a philosophical position, constitutes a perspective in 
which the cognitive act is construed as an interaction between subject and object. 
Relativism is the bare denial of absolutism. It features characteristics developed 





from the perspective above: (a) an indefinite object (no unconditional decid-
ability) and (b) a finite subject (cognitive agent) (no unconditional freedom).
Relativism admits a middle ground: conditional decidability and conditional
freedom.

2.1. The Indefinite Object: No Unconditional Decidability 

Conditional decidability is characterized by a collection of explicit principles
that may be described as "modal," "epistemological," and "ontological."

2.1.1. Modal Characteristic 

The modal characteristic has already been discussed in some detail. The
consequence of the hierarchical nature of principle and freedom leads to a
systemic structure that is characteristic of literally all concepts. We have
previously [3] demonstrated that a taxonomy of systems resulted in an extension 
of the traditional morphological taxonomy downward to include all inanimate 
things and upward to include all symbolic (or semiotic) entities differentiated 
according to a general concept of “emergence.” 

"System" is characterized by the following connotations: (a) partition
(into parts), (b) a number of hierarchical layers, (c) a homeostasis of certain
internal properties held within limits over a wide range of environmental con-
ditions as determined by (d) norms associated with each partition at each level.

Conceptual systems are extended by emergence. "Emergence" is a creative
semiotic act of the cognizing agent which may add a new hierarchical level at
the top by the synthesis (concrescence) of components into an enlarged whole
or may add a new hierarchical layer at the bottom by a partitioning of the
elements (schism) to form a new elemental level.

2.1.2. Epistemological Characteristic 

The epistemological characteristics originate from a clarification of the 
nature of the meta-principles used to restore decidability in the face of freedom 
engendered by the relativity of object-constructs. The presence of degrees of 
freedom admits of corresponding kinds or sources of undecidability. Each kind 
of undecidability is resolved by an appropriate principle or meta-principle. 

To assert that a statement is "true" is simply to assert that the statement
is admissible with respect to a principle that resolves a specific kind of undecidability
at a subordinate level of decision. These meta-principles for which we search in
order to establish relative decidability are principles that (a) rule out concepts,
sentences, or models which have intrinsic sources of undecidability (ambiguity) 
as categorically inadmissible , (b) set up a threshold criterion with respect to a 
norm, (c) enable a unique selection from values distributed in a range of freedom, 
or (d) select the action that optimizes future freedom. The truth-value with 
respect to (a) is two-valued, that is, the norm is either satisfied or not satisfied;
or (b) it is measured on a continuum we refer to as a "warrant." The decision
to admit or not in (b) depends on the measure of warrant; that is, the warrant
is deficient with respect to a norm, or it is sufficient or better. There is a functional
truth-value associated with (c), as the measure of freedom is considered an





"adjustable parameter" and the cognitive principle is an operator that selects
among the degrees of freedom. The last case (d) constitutes the resolution of 
evolutionary undecidability. There are thus several kinds of truth and measures 
of truth. 

A concomitant characteristic is the requirement that a statement be testable
in principle; otherwise, a new kind of undecidability is introduced and is ruled
out by the simple and effective device of declaring it inadmissible under an
additional cognitive control. Testability involves an interaction, since only
through interactions may statements be tested. Thus the principles we seek
are in the nature of operators that operate on a function defined in the space of
freedom. We seek here a general definition. We may symbolize this as

A = O_p[t(f)] = O_p[t], \qquad (1)

meaning that the decision A is determined by a decision operator O of par-
ticular class p operating on a normative "truth" function t measured in a dimen-
sion of a "freedom" variable f.

Categorical truth norm 

Formal truth is two-valued. The variable of freedom is a measure of am-
biguity. The operator admits or rejects, depending on whether the operand has
one interpretation or more than one. In other words, a statement or set of
statements is rejected categorically if any formal ambiguity exists. The truth-
value is "true" if no ambiguity exists; conversely, "false." The truth-value
must be differentiated from the action operator, which may be "admit if true,"
"reject if false," or vice versa.

Threshold truth norm 

In experimental measures we shall decide the admissibility (i.e., "truth")
of an objective theory according to a truth function t = \omega(f) - C, with
A = H[\omega(f) - C], where \omega(f) is a confidence measure such that 0 \le \omega(f) \le 1,
C is a confidence norm, 0 < C < 1, and H(x) is the Heaviside step function.
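As a concrete illustration (ours, not the author's; the step-function convention H(0) = 1 is an assumption), the threshold decision can be coded directly:

```python
def heaviside(x):
    """Heaviside step function; we adopt the convention H(0) = 1."""
    return 1 if x >= 0 else 0

def admit(confidence, norm):
    """Threshold truth norm: A = H[w(f) - C]; admit iff confidence meets the norm."""
    return heaviside(confidence - norm)

print(admit(0.95, 0.9), admit(0.80, 0.9))  # 1 0
```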

Functional truth norm 

At the level of object theories the admissibility may be determined more
generally as described; for example, consider a Bayesian approach to a prob-
ability estimate. As an act of policy admit a specific model that has two states
(heads or tails) and in which the occurrence of heads is uniformly at
random with probability parameter p, 0 \le p \le 1. The probability estimate
\pi of the occurrence of heads in the throw of a two-sided coin may be given by

\pi = \int_0^1 (n + 1) \binom{n}{k}\, p^{k+1} (1 - p)^{n-k}\, dp, \qquad (2)

where n previous trials have resulted in heads occurring k times. Here it is
to be pointed out that p represents a degree and range of freedom, that the
integrand represents an operand of weighted "truth," and that the operator
\int dp is the decision operator. It follows that \pi = (k + 1)/(n + 2) is the so-called





Laplace-Bayesian weighted "average." This is a meta-decision, relative to a
magnitude of measure associated with an object variable. Whether to bet or 
not would be a threshold decision. What odds constitute a fair bet would be 
determined by the two participants in a threshold decision that coincided or 
overlapped; that is, the decision to gamble was admissible to each party. 
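The reduction of (2) to (k + 1)/(n + 2) can be verified with exact arithmetic; the sketch below is our illustration and evaluates the Beta integral in closed form rather than numerically:

```python
from fractions import Fraction
from math import comb, factorial

def laplace_estimate(n, k):
    """Evaluate (2): the integral of (n+1) C(n,k) p^(k+1) (1-p)^(n-k) over [0, 1].
    The Beta integral of p^a (1-p)^b on [0, 1] is a! b! / (a + b + 1)!."""
    a, b = k + 1, n - k
    integral = Fraction(factorial(a) * factorial(b), factorial(a + b + 1))
    return (n + 1) * comb(n, k) * integral

# Compare with the Laplace rule of succession (k + 1)/(n + 2) for small n.
for n in range(10):
    for k in range(n + 1):
        assert laplace_estimate(n, k) == Fraction(k + 1, n + 2)
print("Laplace rule (k+1)/(n+2) confirmed")
```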

Optimization truth norm 

Finally, if the operand is a measure of freedom directly (or probability of a
level of freedom), the appropriate decidability operator will be the optimization
(i.e., selection of the most favorable action). In value-decision theory this operator
is that of practical decision; for example, select a strategy a from a range of
possible actions A such that

Q(x, r) = \sup_{a \in A} \{K(y, s \mid x, r)\, Q(y, s)\}, \qquad (3)

where x, y are vector states of a system of states W selected from the same
set, x \subset W, y \subset W, r < s are time variables, Q(x, r), Q(y, s) are vector sets of
values of the respective states at the times indicated, and K is a matrix of transition
probabilities from states x at r to states y at s.
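A minimal sketch of the decision operator in (3) follows; it is ours, and the two-state transition matrices and successor values are invented purely for illustration:

```python
def optimal_value(K, Q_next, x):
    """Q(x, r) = sup over strategies a of sum_y K[a][x][y] * Q_next[y]."""
    return max(sum(K[a][x][y] * Q_next[y] for y in range(len(Q_next)))
               for a in range(len(K)))

# Two strategies over two states; rows index the current state x,
# columns the successor state y (each row sums to 1).
K = [[[0.9, 0.1], [0.5, 0.5]],   # strategy 0
     [[0.2, 0.8], [0.3, 0.7]]]   # strategy 1
Q_next = [1.0, 3.0]              # values of the successor states at time s
print(optimal_value(K, Q_next, 0))  # strategy 1 gives 0.2*1 + 0.8*3 = 2.6
```

Iterating this backup over decreasing times is the familiar dynamic-programming reading of (3).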

2.1.3. The Ontological Characteristic 

The preceding concept of truth is naturally associated with the concept of 
existence: existence is established by the admissibility of objectifying statements. 
An “objectifying” sentence is one that literally creates a new concept. An 
example is found in Newton’s “laws” of motion. “Force” and “mass” are 
conceptualized and the relations between them described by the laws as the
basis for a theory of mechanics. The admissibility of objectifying statements
such as these is determined by an elaborate series of tests which constitute the
whole integrated structure of cognitive controls. 

Since existence is determined by cognitive tests, a kind of existence may be
associated with each set of tests, including the appropriate truth function and
decidability operator; for example, when a mathematician states a theorem
beginning, "There exists a function, f(x), such that ... etc.," he does not refer
to a substantive existence but is declaring that the statement to follow passes
a test of nonambiguity with reference to the law of consistence (a cognitive
control).

We recognize three broad classes of existence of objects as determined by 
sets of tests of admissibility (and by traditional modes of inquiry) to which 
their objectifying statements are to be submitted. Formal objects (abstractions)
are subjected to a class of formal controls, but they are nontestable with respect
to empirical (we shall use the word "extrospective" for a very good reason)
tests. Substantive constructs (the "real" world of the objectivist) are subject
to the whole gamut of tests, a process that will put drastic restrictions on their
representation. Finally, value objects are subjected to a set of tests made subtly
different from the tests of substantive constructs by a property of formal duality. 





3. FORMAL DUALITY 

The property of formal duality is relevant to the topic of cognitive controls, 
since it leads to an understanding of the essential characteristic differences among
concepts of substantive things, formal abstractions, and values and to the com-
plementary nature of value and fact, subject and object, and extrospection
and introspection. It is the objective here to give an insight into these conceptual 
characteristics rather than a precise mathematical formulation. 

Consider an objectified Euclidean space of n dimensions, in which a point is
denoted by the vector q = (q_1, \ldots, q_n). Consider further a utility function
of a specific system F(q, Tq, t), which is of the nature of a truth function in
some measure of freedom previously described; t represents time, and T is an
operator acting only on q (T may be a constant or d/dt, etc.).

The utility function F is usually nonlinear, so that a dynamic action problem
utilizing this model would depend on the path traversed in object-space. From
the viewpoint of cybernetic measures, this is not so simple as desired. Therefore
we seek through an enlargement of the model a simpler means of problem solving.
This is accomplished by the "Legendre transformation" [5], provided F has
certain required properties [6]. To each dimension of object-space q_i is added a
new conjugate dimension p_i such that a model in 2n-space (or "phase" space)
is achieved which constitutes a "perfect" differential; in effect a path-depend-
ent problem in n-dimensional object-space is replaced by a non-path-dependent
problem in 2n-dimensional phase space; in other words, a dynamic problem
is reduced to a static problem.

3.1. The Legendre Transformation 

The Legendre transformation may be written in differential form

d(p \cdot Tq) + d(T^+ p \cdot q) - dF(q, Tq, t) - dG(p, T^+ p, t) = 0,

where the operator T^+ is "adjoint" [7] to T and where (x \cdot y) represents the
scalar product consistent with the operators. T operates only on the primal
variable q and T^+ operates only on the canonically conjugate variable p. The
function G(p, T^+ p, t), complementary to F(q, Tq, t), is called the "dual" of
F (the "primal"). The roles of primal and dual are interchangeable.

Designating the partial derivative \partial F/\partial q by F_q, etc., and equating varia-
tionals we have the generalized canonical equations:

Tq = G_p, \qquad p = F_{Tq}.

The differential of the dual function dG makes up the deficit of the contribu-
tion of dF to the perfect differential; for example, for T = T^+ = 1 we have
(replacing T^+ by T) p = F_q and q = G_p.
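As a concrete check, the sketch below (ours; the mass and quadratic potential are invented for the example) performs the Legendre transformation of a Lagrangian F = (1/2) m q̇² − V₀ q² for the mechanical case T = d/dt and verifies the canonical relation q̇ = G_p numerically:

```python
m = 2.0    # an invented mass for the example
V0 = 0.5   # an invented potential coefficient, V(q) = V0 * q**2

def F(q, qdot):
    """Primal function (here a Lagrangian): F = (1/2) m qdot^2 - V0 q^2."""
    return 0.5 * m * qdot**2 - V0 * q**2

def legendre_dual(q, p):
    """Dual (Hamiltonian) G(p, q) = p*qdot - F, evaluated at qdot = p/m."""
    qdot = p / m   # inverts the canonical relation p = dF/dqdot = m * qdot
    return p * qdot - F(q, qdot)

q, p = 1.0, 3.0
h = 1e-6
# Recover qdot as the p-derivative of the dual by central differencing: qdot = G_p
qdot_num = (legendre_dual(q, p + h) - legendre_dual(q, p - h)) / (2 * h)
print(qdot_num, p / m)  # the two agree, illustrating the canonical pair
```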





The following observations can be made: 

1. In a model in which F = F(q) only, the canonical variable p_i conjugate
to q_i is interpretable as the holistic value per unit object, that is, the marginal
value of a "thing." Other models can be interpreted appropriately; the dual
space is related to concepts of marginal values.

2. The primal-dual relation is symmetrical; the dual could just as easily 
represent a value function in a space of some other kind of object. 

3. The differences between the adjoint operators T and T^+ and the dual
functions and conjugate relation determine the essential differences and rela-
tion between the concept of value and the concept of object. These differences
orient values a priori and objects a posteriori, so that the rate of warrant is
accomplished much faster for objects than for values over some, though not all, of
the taxonomic scale. Some objects on the taxonomic scale are directly perceived
by our senses; that is, our senses establish a peremptory association with these
objects. Values, on the other hand, are vindicated at usually a much slower pace,
some processes spanning centuries. At the lower (atomic) end of the taxonomic
scale, however, the time scales for vindication of system and antisystem are
each much shorter than the interval of practical decision, and, furthermore,
there is no peremptory association with either; hence both primal and dual
systems are interpreted as conjugate objects.

4. The function F is holistic with respect to a system of objects; the func- 
tion G is holistic with respect to a related system in conjugate space. This 
related system may be called the “adjoint” or “anti” system. 

5. For the case T = d/dt, T^+ = -(d/dt) it is possible to transform the
Tq variables only, keeping the q variables as parameters. The dual function
H(p, q, t) is the classical Hamiltonian for the conservative system,
and the conditions for obtaining the perfect differential lead to the classical
canonical equations. (Note that there are two Hamiltonians, one H(p, q)
associated with the primal, one H^+(p, q) associated with the dual.) We have
H = -H^+ and \dot{q} = H_p, \dot{p} = -H_q or, in symmetrical form, \dot{q} = H_p, \dot{p} = H^+_q.

6. No new information is contained in the dual over that in the primal; 
however, different forms of the problems may have different difficulties of 
solution, hence the dual, or Hamiltonian, form may lead to more efficient 
solutions (obviously, also, it may be more difficult). It is frequently possible 
to find a form of the Hamiltonian that is time-independent; hence dynamic 
problems in object (q) space become static problems in phase (p, q) space. The 
introduction of the canonical conjugate variable simplifies the solution of 
valuative problems [8]. 

From the viewpoint of decision models the functions F, G, and H lead to 
stationary variational principles. These principles become major laws in the 
particular subject covered. 

For models in which the holistic utility function is linearly additive and 
unconstrained, the model already possesses the “perfect” form sought, and 
the dual variables are degenerate, becoming constants. 



A THEORY OF THE CONTROL OF THE COGNITIVE PROCESSES 




4. FREEDOM 

The concept “optimal organization” is proposed as the generalized and supreme 
value. Decidability and freedom, as operator and operand, are complementary 
characteristics of optimal organization. Freedom, at the level of practical 
decision, is measured by the number of future object-decision alternatives that 
exist in the volume of hyperspace of feasible future decisions. In contrast to 
objective freedom, cybernetic freedom, at the level of meta-decision, is measured 
by the cybernetic capacity available, or free, for semiotic operations. 
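
The count of feasible alternatives described above can be made concrete. The sketch below is our own illustration, not the author's formalism; the decision variables and the feasibility constraint are hypothetical.

```python
# Illustrative sketch: object-freedom measured as the number of future
# decision alternatives inside a constrained "hyperspace" of options.
from itertools import product

def object_freedom(levels_per_variable, feasible):
    """Count the decision alternatives that satisfy the feasibility predicate."""
    return sum(1 for point in product(*levels_per_variable) if feasible(point))

# Three decision variables, each with four settings, under a hypothetical
# budget-style constraint: freedom is the count of feasible combinations.
levels = [range(4)] * 3
freedom = object_freedom(levels, feasible=lambda p: sum(p) <= 5)
```

Tightening the constraint shrinks the feasible volume and hence the measure of freedom; an unconstrained agent here has 4³ = 64 alternatives.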

This concept of freedom, we believe, subsumes all other measures of value 
associated with the strategic posture. Survival? It is the sine qua non of freedom. 
Beyond survival of the immediate threat, when survival is probable, lies the 
potential for surviving more remote future stresses. Abundance, in a narrow 
range of circumstances, may be an approximate measure of freedom. “Adapt- 
ability” is the appropriate description of object-freedom. 

We must differentiate among measures of freedom, there being a measure 
of freedom associated, at every hierarchical level, with every element produced 
by partition — except at the highest level. At the highest cybernetic level every 
cognitive agent is preprogrammed; there is no freedom, no problematic situa- 
tion requiring decision. The level at which the cognitive agent becomes pre- 
programmed is the basis for a taxonomy of systems [2]. 

There is no simple algorithm for the identification of the free variables in 
a system. This identification is part of the creative act of objectification. How- 
ever, the hierarchical structure of the system will impose restraints on the 
measures of freedom. There is a coupling, sometimes strong, between the 
measures of freedom (norms) at different levels: 

1. By an action we mean simply the modification of norms of subsystems 
and the allocation of free (uncommitted) resources to subsystems. 

2. By a reorganization of the object-system (i.e., an institutional reorganiza- 
tion) we hope to gain cybernetic freedom for the institution as a decision- 
making body. 

3. By the reformalization of our decision models we expect to gain cy- 
bernetic freedom for ourselves as cognitive agents. 

4. By a re-evaluation we select new values (policies, strategies) as a means 
of increasing expected freedom in the composite systemic structure. 


5. A UNIFIED META-SYSTEM: THE COGNITIVE CONTROLS 

We can now assemble the meta-principles that serve to maintain or accomplish 
decidability in object-models. These principles are controls in the true sense; 
they determine a course through the operations involved in selecting, evaluating, 
quantifying, and using a cognitive model for practical decision purposes. These 
controls are meant to be a paradigm of that portion of the cognitive act, de- 
scribed as “rational,” associated with decidability. It omits any structured 





NICHOLAS M. SMITH 


theory of the creative process. It is intended that the treatment be sufficiently 
general to encompass the entire range of behavioral response, hence to constitute 
a unified meta-system. 

The cognitive controls are classified according to their roles in structuring 
the act of conceptualization. The controls being interdependent, such classifica- 
tion is not always clear-cut. The formal controls are categorical; that is, they 
must be satisfied or the decision object-model will be ambiguous. The associ- 
ated truth function is two-valued; the controls apply concomitantly with the 
practical action; that is, the information on which they are based is immediate 
information, looking neither forward nor backward. The extrospective controls 
concern the admissibility of raw data by transducers and the applicability of 
object-models for prediction of consequent object-states. The introspective 
controls are concerned with the admissibility of object-models with respect 
to strategies for action — prescription — and arise from internal requirements 
of the cognitive agent. Finally, the evolutionary controls are the application of 
the supreme directive — a drive toward freedom — in the selection of the entire 
portfolio of programmed responses of the cognitive agent. The controls are 
listed in Table 1. 


Table 1 

Canons of the Rational Process 

Formal — categorical 
  Syntax 
  Consistence 
  Completeness 
  Superjective identification 
  Ontological parity 
  Procedural invariance 
  Testability 

Extrospective — noncategorical 
  Criteria of fact 
  Extrospective nonambiguity 

Introspective — cybernetic 
  Problematic area 
  Risk 
  Rigidity 
  Practicability 
  Elegance 

Evolutionary — freedom 
  Optimization 

5.1. The Formal Controls 

The formal controls determine the format of objectification. Historically, 
the oldest and most discussed is the control of consistency: an object-statement 
(words assembled in a well-formed string according to a formal syntax) and 
its negation are not both admissible in the same context. Such admission leads 
to conditionally universal ambiguity, that is, any state objectifiable in the given 
reduced model is possible simultaneously with any other. (Reminder: all controls 
are decidable with respect to a reduced model that is finite and discrete; hence 
an inconsistency does not imply any situation not describable in the context 
of the given model.) 
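
The consistency control can be sketched mechanically. The check below is our own minimal illustration over a finite reduced model, not the author's formalism: a context is rejected when it admits both a well-formed statement and its negation.

```python
def consistent(admitted):
    """Return False if any statement and its negation are both admitted.

    `admitted` is a set of literals over a finite, discrete reduced model,
    written 'p' for an object-statement and '~p' for its negation.
    """
    return not any(lit.startswith("~") and lit[1:] in admitted
                   for lit in admitted)
```

An inconsistent context such as {"p", "~p"} is inadmissible: its admission would make every objectifiable state simultaneously possible.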




The demand for completeness requires that the format of the finite-discrete 
model permit a statement about any of the objectified states of the system. 
(Excluded: “Have you stopped beating your wife?”) The form of the model 
must therefore be tautological, that is, disjunctive normal. Statements like 
“p implies q” are nonformal. They are admissible under some other control 
(say, extrospective). 

The systemic properties of concepts require the identification of the larger 
hierarchical echelon (the “superject” — after Whitehead) as a context that 
modifies and contributes to the identification of an object, hence the control of 
superjective identification. A particular objectification must be a member of 
some more inclusive hierarchical assembly; it may have one and only one super- 
ject, which cannot be itself. Failure to satisfy this control can lead to paradox 
because of the implied ambiguity. (Example: the male barber who shaved 
every man except those who shaved themselves. Solution: Mr. B is a barber 
only when he is a public-service professional shaving other men. He shaves 
himself as an individual. The President has no constitutional authority to 
discipline Lynda Bird; this role belongs only to her father, Mr. Johnson.) 
Because the format of objectification can have identical formal properties from 
echelon to echelon, logicians sometimes attribute a reflexive property to formal 
systems that does not exist in substantive systems. This is the source of the well- 
known paradox about sets that are not members of themselves. Differentiation 
of the echelons of synthesis will remove the ambiguity [9]. 

The control of ontological parity requires that every term of a statement be 
subject to the same tests of admissibility (existence). By “term” is meant 
strings connected by logical identity, logical disjunction, algebraic equality, or 
algebraic summation (difference). Since there are many kinds of truth measures 
corresponding to sets of conditions for decidability, it is possible to violate this 
requirement. A single term may contain subclauses about objects not ontologi- 
cally parallel, provided there are operators that provide the entire term with 
the appropriate status. (Example of violation: “The probability of heads in the 
throw of a coin is the limit of the ratio of number of heads to number of throws 
as the latter goes to infinity.” This quantity is a formal parameter, untestable 
by substantive means. Violation: “Happiness is a puppy dog.”) This control 
is a generalization of Moore's admonition to avoid the Naturalistic Fallacy [10] 
and of the well-known principle of dimensional parity of physical equations. 

Procedural invariance is a generalization of Einstein's principle of invariant 
transformations: under conditions in which procedure is not a part of the 
objectification, prediction or prescription must be independent of the procedure 
of analysis. In a theory of space-time it leads to the property of relativity. In 
decision theory it leads in stochastic models to a restriction of the Chapman- 
Kolmogorov relation, which is invariant to translation in time, and subsequently, 
with the practical decision operator, to the principle of optimality of time- 
dependent programming in order to achieve an invariant form of a value-decision 
equation, provided that the decision operator distributes to the right and that 
the strategy in each stage is selected independently. (It does not apply, there- 
fore, to the traveling salesman problem.) We have shown that finite stochastic 
models define a space-time metric that is inherently relativistic [11]. 





Last, we have a formal requirement of testability in principle; that is, a 
concept must be testable in some sense for ambiguity. A concept formulated so that it is 
nontestable is literally nonsense, since by lack of this characteristic it has no 
connectability to the predictive or the prescriptive processes. In short, we are 
defining “sense” to mean having this connectability. It is admissible to 
verbalize about nonconnectable concepts. However, such concepts can have 
nothing to do with acts of prediction (of substantive events) or prescription 
(of practical decisions). 

It may appear to some readers that our primitive concepts fail to meet the 
requirement of testability. This is not the case. It is true that primitives do 
not permit a priori testing before their initial use. The viability and competi- 
tiveness of the decision system itself constitutes a test of these primitives. What 
is excluded by this control are concepts that by their very nature have no inter- 
actability or rational connectivity to the general universe of objects, or concepts 
the testing of which is excluded by their nature. 

The requirement of ontological parity combined with the requirement of 
testability has some interesting consequences with respect to the representa- 
tion of substantive concepts. Testability rules out infinite procedures in testing. 
Hence the limiting probability of heads in the throws of a coin as the number 
of throws goes to infinity cannot be a substantive concept. Such a concept must 
be regarded as a purely formal device. These restrictions, together, eliminate 
as substantive concepts the notion of continuous space-time. Continuity in 
either space or time has the status of a formal abstraction, since these constructs 
are intrinsically nontestable by experimental means. “Real” time and “real” 
space must be discrete [11]. 
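
The distinction drawn above can be put concretely. The sketch below is our own illustration: any actual test of a probability yields a finite, discrete, rational-fraction estimate, whereas the limiting value as the number of throws goes to infinity is never observed and remains a purely formal device.

```python
# A testable (finite, discrete) estimate of a probability: the rational
# fraction of heads in n throws.  The n -> infinity limit is not observable.
from fractions import Fraction
import random

def observed_frequency(n, p=0.5, seed=0):
    """Simulate n coin throws and return the rational fraction of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < p for _ in range(n))
    return Fraction(heads, n)

freq = observed_frequency(1000)
```

However many throws are made, the result is a ratio of two finite integers; the limiting probability itself has only the status of a formal parameter.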


5.2. Extrospective Controls 

The admissibility with respect to noncategorical extrospective controls is 
determined by a truth threshold function measured by a warrant on a continu- 
ous interval in comparison to a norm established by other (introspective) 
requirements. 

In an infinite class of objectifications the cognizing agent must select one 
objectification as a matter of policy on which to base further action. This 
particular objectification exists as a pretheory in terms of which we establish 
the criteria-of-fact. The criteria-of-fact constitute the (specifications of a) 
filter. This filter transmits that extrospection allowed by these preselected 
specifications as “relevant,” that is, relevant with respect to a preselected 
range of problematic situations. Thus, even before any observations are made, 
the element of policy and the element of form have entered into the cognitive 
act. The malfunction of this process may lead to an instrumental ambiguity. 

The test for instrumental ambiguity is itself risky: is the action to be taken 
stable (nonambiguous) with respect to the admission of randomly selected 
information normally excluded by the filter? Every organization, entrepreneur, 
or individual must constantly test his acts for instrumental ambiguity. One's 
information is always biased by the filter; hence redundance in action models is 
essential. 





The particular objectification and filter selected leads through the pre- 
dictive process to an expectation. Except in the most simplistic of models there 
is never complete agreement between expectation and extrospection. An 
inadmissible difference between these two is extrospective ambiguity. Given a 
preselected problematic area and a preselected level of confidence determined 
by introspective means, a measure of extrospective admissibility is determined 
by a threshold operator on a truth function comparing a measure of confidence 
with a preselected norm. The determination of the measure of confidence is a 
problem in the design of experiments and statistical inference and depends on 
the particular problem area and formal model chosen as acts of policy. 
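
The decidability test described above can be sketched directly. This is our own minimal illustration, not the author's formalism: a continuous truth value (a measure of confidence) is compared against a norm chosen by introspective means, and the threshold operator returns the two-valued verdict.

```python
def admissible(confidence, norm):
    """Threshold operator: admit the model only if confidence meets the norm.

    `confidence` is a continuous truth value on [0, 1]; `norm` is the
    preselected level of confidence fixed introspectively as a matter of policy.
    """
    if not (0.0 <= confidence <= 1.0):
        raise ValueError("truth value must lie on the interval [0, 1]")
    return confidence >= norm
```

The truth value is never exactly 0 or 1 in practice, so admission is always taken at some residual risk fixed by the choice of norm.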

The details of this process are beyond the scope of this paper. The selection 
of the formal model, of the problematic area, and of the risk of failure are all 
nonextrospective acts. Therefore, as with the control avoiding instrumental 
ambiguity, the “solid fact” is not so solid, but contains factors of extrospection, 
form, and value. The important points are that the truth value is continuous 
on the range 0 < t < 1, never wholly false, never wholly true, that is, never 
without risk; that the decidability operator is the threshold operator; and that 
definite models, either deterministic or stochastic definite, are not appropriate 
and in fact do not occur. The theoretician is engaging in two acts simultaneously: 
evaluating his model and evaluating parameters in the model. The truth value 
associated with the latter act is functional. Because, frequently, the theoretician 
announces only the final product of his act, he may be deluded into believing 
that his last iteration is an appropriate description, and the preparatory work, 
done covertly and not consciously programmed, is mere thrashing around. 
The present emphasis on Bayesian analysis is a move in the direction of overtly 
programming the procedure involved in assigning numbers to formal parameters 
for use in practical decisions. Unfortunately, mathematicians are still suffi- 
ciently influenced by the old, deterministic, objective school that they feel 
called on to describe their operations in objective terminology. 

5.3. Introspective Controls 

The introspective controls are generated by the internal requirements and 
limitations of the cognitive agent. These we might refer to also as the “cy- 
bernetic controls,” since they can be linked to the limitations of finiteness on 
the control organ itself. The tendency is to regard introspection as an aesthetic 
process, somehow mysterious — an organic drive of the species — and to be 
accepted as an innate characteristic of the decision maker. It is all of these things. 
Yet when we regard introspection in the context of the finite cognitive agent, it 
begins to appear as a set of characteristics imposed by finiteness. 

The selection of the problematic area is a strategy of composition of the whole 
portfolio and involves a trade-off between practicability and scope, that is, the 
resolving of the immediate problem here and now or the placing of some in- 
vestment that may simplify a similar or related situation some time in the 
future. 

Risk with respect to possibility of failure to achieve goals is also determined 
with respect to the entire repertoire of programmed responses. Here the 





concept of balance leads to a personal strategy of risk. The risk associated with 
a practical act is related to the risks associated with alternative practical acts, 
with alternative decision and predictive models, and with the risks associated 
with the null-act, that is, letting nature take its course. Here we are concerned 
with the balance between the risk associated with the practical act and the 
risk associated with failure of the decision model. 

Rigidity concerns adherence to a strategy despite a recent history of adverse 
results. There is a basic compromise in the cognitive agent between rigidity and 
pliability. If he is completely rigid, he is nonadaptive. He seeks to make the 
world of experience conform to a concept of that world which he holds. If 
he is too pliable, then he is essentially unprincipled and his responsiveness is 
determined purely with respect to immediate rewards. Because of the un- 
certainty of appropriateness of models and because of the stochastic nature of 
real life, some strategy between the two extremes is optimal. This strategy 
tends to be set early in the lifetime of an individual person and becomes a 
characteristic personality trait. 

As the cybernetic limits of the control agent are reached he becomes saturated 
and cannot increase the range of his responsiveness except through a reorganiza- 
tion of his programs, looked at as a total repertoire. The cognitive agent seeks 
to develop the set of control models, principles, algorithms, etc., that 
consumes minimum cybernetic capacity. This is the meaning of the drive 
toward elegance. In this respect a set of very simple models does not constitute 
elegance, since such models are very inefficient in their utilization of cybernetic 
capacity. The cognitive agent, when filled to capacity with such programs, is not 
provided with a wide range of response. However, if a number of simple, elemental 
models are replaced by a single, somewhat more complicated model, elegance 
is achieved whenever the result is a net decrease in the demand on cybernetic 
capacity. This provides the cognitive agent with an excess capacity (cybernetic 
freedom) for the development of new programs which may enlarge the range of 
his responsiveness. 

When such a reorganization results in a sharp decrease in the demand for 
cybernetic capacity, a “pattern” will have been created. Thus pattern formation 
may be described in terms of the optimization of a total repertoire of programs 
with a simultaneous minimization of demand on cybernetic capacity. 

5.4. Evolutionary Controls 

These are the optimization operators (max, min, sup, inf, maximin, 
etc.). They operate to consume the immediate freedom by an act that opti- 
mizes measures of future freedom. These are the “measures of effectiveness.” 
We call them the evolutionary controls because a minimal measure of freedom 
is the probability of survival of a future threat and a measure of adaptivity may 
be interpreted as a general measure of future freedom. Probability of survival 
as a measure implies a finite probability of immortality. Such a measure is a 
legitimate value of substantive systems only if the certain death of the system 
lies so far in the uncertain future that the projected rational act is independent 





of prediction. Otherwise its use in a traumatic experience may lead to a cata- 
strophic loss of the power of decision [10]. Safer values are found in a sequence 
of finite goals. 


6. ANALOGICAL CONFORMITY 

Substantive theories all have analogous object-properties brought about by the 
very restrictive character of the formal cognitive (categorical) controls. The 
controls, being collectively necessary conditions for nonambiguity, will impart 
to every admissible model a set of analogous properties. This characteristic 
we call the principle of analogical conformity [11]. The property of formal duality 
leads to stationary variational principles such as the conservation laws in physics 
(action, momentum, energy). The analog of these principles may be found in 
any well-formed theory. The relativistic properties of space-time are exhibited 
by a simple Markov system and arise from the same cognitive requirements 
[11]. 

The conjugate relationship between value and fact is strictly analogous to 
the relation between momentum and position in physics. Indeed, we may 
regard the concepts “momentum” and “force” as value concepts in a model 
reduced to the very simplest properties. Conversely, the Heisenberg uncer- 
tainty principle applies to the relation between inventory and marginal value. 

Whereas there exists a fairly consistent taxonomic scheme of substantive things, 
there is no such integrated taxonomy of values. When it is realized that at some 
level a value-concept space stands in canonical conjugate relation to an object- 
space, it is always necessary to qualify any value statement by the identifica- 
tion of the object-level at which the primal-dual relation exists. This is a source 
of confusion in value statements, particularly between “utility,” “value,” 
and “ethics,” yet we have also seen that “force” and “momentum” have a 
place on the taxonomy of values! The latter, however, have a dual relation to 
object space only in a very special subclass of highly reduced models. 

7. THE EDGE OF REASON 

Man faces a fundamental uncertainty: the more he improves his practical 
decision process, the more problems he will have to decide in the future. 
Is this process divergent, stationary, or convergent? If it diverges, man is be- 
coming freer but less able to cope rationally with his world; if it converges, man 
is becoming more rational but has fewer problems to which to address himself. 
He seeks to be neither wholly free (hence completely subject to the uncertain 
whims of fate) nor completely decidable (hence reduced to the status of an 
automaton). He must choose a strategy that maintains some compromise 
between the two extremes. 

Today, as a result of an unprecedented allocation of resources to objective 
research, we are witnessing a technological explosion in which man is discover- 
ing alternatives more quickly than he can choose intelligently among them. 
To counteract this surge of freedom we are developing vast management 
complexes. These management systems in turn are severely obstructed by their 





own cybernetic limitations. Internal communication is limited, and the rational 
processing of decision problems is drastically retarded. As a result many 
major acts come by default; man has surrendered his prerogatives because his 
organizational cybernetic capacity is saturated. Therefore, in spite of the best 
of policy and the most capable of managers, decisions that should be made to 
maintain a level of control do not get made. Obviously something is wrong with 
our strategy. 

We can distinguish between objective freedom and objective decidability 
and subjective freedom and decidability. The dilemma described occurs as a 
result of present exclusive orientation toward objective concerns. 

The subjective, or cybernetic, limitations are those that today constitute 
the edge of reason. To restore the balance between freedom and decidability 
we must shift the weight of our resources from technology to the development 
of the science of decision itself. And here we do not refer to the blind and naive 
application of outmoded methodologies. What is needed is an increase by many 
orders of magnitude in our cybernetic capabilities through bold and sophis- 
ticated programs. 

The present edge of reason, like fog to a shipmaster, obscures the natural 
horizon. Every fixed cognitive agent possesses an ultimate limit to his cyber- 
netic efficiency. Similar to objective entropy, there is an internal or subjective 
entropy determined by the probability of reobjectifying the entire portfolio 
of programs into a new set that will increase cybernetic freedom. A principle 
of subjective entropy states that this probability decreases with each act of 
concrescence. Thus the cognitive agent faces an ultimate internal death as 
a free agent, just as man faces an ultimate external entropic death. 

Man may extend his responsiveness by adding capacity external to himself 
(libraries, schools) and by professional specialization. The ultimate freedom of man 
may rest, however, with the possibility of developing machines capable of 
increasing his creative capacity in kind. If we observe the orderly progress of 
biological evolution and admit the formal arguments by von Neumann [12] 
that machines can make new machines that are smarter than the originals, the 
inner entropic death of man may be postponed until the day arrives when 
cosmic free energy is unavailable. 

As long as man is free, his system of rational responsiveness will ultimately 
rest on some general principles whose vindication remains to be accomplished. 
He can be rational only within a small enclosure in time and space, and within 
the limitations of his own mind, the borders of which constitute horizons for rational 
response. In spite of this horizon, he is forced to guide himself by basic principles 
that lie continually beyond the edge of reason. 


8. ACKNOWLEDGMENT 

I wish to express my great appreciation to Mr. Milton C. Marney, my colleague 
and collaborator in this study area (The Foundations of the Prescriptive Sciences), 
for his very helpful discussion of these ideas and his careful criticism of this 
article. 





9. REFERENCES 

[1] Kurt Gödel, On Formally Undecidable Propositions, translated by B. Meltzer, Basic 
Books, New York, 1962. 

[2] N. M. Smith and M. C. Marney, “Management Science — An Intellectual Inno- 
vation,” Research Analysis Corp., McLean, Virginia, TP-43, August 1961. 

[3] M. C. Marney and N. M. Smith, “The Domain of Adaptive Systems: A Rudi- 
mentary Taxonomy,” Yearbook of the Society for General Systems Research, Vol. 
IX, 1964. 

[4] A. N. Whitehead, Process and Reality, The Social Science Book Store, New 
York, 1929. 

[5] C. Lanczos, The Variational Principles of Mechanics, University of Toronto Press, 
2nd ed., 1962, pp. 161-164. See also R. Courant and D. Hilbert, Methods of 
Mathematical Physics, Vol. 1, Interscience, New York, 1953, pp. 231-242. 

[6] H. W. Kuhn and A. W. Tucker, “Nonlinear Programming,” Proc. 2nd Berkeley Symp. 
Math. Stat. Prob., J. Neyman (Ed.), University of California Press, Berkeley, 
1951, pp. 481-493. See also G. P. McCormick, “Second Order Conditions for 
Constrained Minima,” Research Analysis Corp., McLean, Virginia, in publication. 

[7] L. B. Rall, “On Complementary Variational Principles,” MRC Rept. No. 55S, 
Mathematics Research Center, U.S. Army, The University of Wisconsin, 1965. 
See also J. Lewins, Importance: The Adjoint Function, Pergamon, New York, 
1965, Appendixes, pp. 152-157, and A. Messiah, Quantum Mechanics, Vol. 1, 
Wiley, New York, 1965, pp. 254-266. 

[8] L. S. Pontryagin et al., The Mathematical Theory of Optimal Processes, translated 
by Trirogoff, L. W. Neustadt (Ed.), Interscience, New York, 1962, pp. 17-73. 

[9] A. N. Whitehead and Bertrand Russell, Principia Mathematica, Vol. I, Uni- 
versity Press, Cambridge, 1960, Chapter 2. 

[10] G. E. Moore, Principia Ethica, University Press, Cambridge, 1954, p. 13. 

[11] N. M. Smith, “A Calculus for Ethics — A Theory of the Structure of Value,” 
Parts I & II, Behavioral Science, 1, No. 2, 111-142 (April 1956); 1, No. 3, 186- 
211 (July 1956). 

[12] J. Myhill, “The Abstract Theory of Self-Reproduction,” Views on General 
Systems Theory, M. D. Mesarovic (Ed.), Wiley, New York, 1964, pp. 106-118. 
See also J. von Neumann, “Probabilistic Logics and the Synthesis of Reliable 
Organisms from Unreliable Components,” Automata Studies, C. Shannon (Ed.), 
Princeton, 1956. 


UNE THEORIE SUR LE CONTROLE 
DES MOYENS DE PERCEPTION 

RÉSUMÉ 

A practical decision is made by means of a decision model fitted only to some 
problematic area. The adequacy of the “reduced” model depends on a number 
of criteria: On what grounds is the model formally admissible? Is it appropriate 
to the problematic area? Is the risk, chosen a priori, commensurate with its use? 
Does it lead to a unique (decidable) action? Is it the best available (and in what 
sense)? These decisions (called “meta-decisions”) are governed by a set of 
cognitive controls designed to furnish a nonambiguous basis for choice at all 
hierarchical levels. Among these controls are consistency and completeness. The 
observation that general concepts have a systemic character leads, however, to 
other sources of undecidability and hence to further cognitive controls. The 
results of applying these controls have far-reaching consequences. They include: 
(1) the restriction of the quantification of substantive concepts (the “real 
world”) to finite, discrete measures (rational fractions); (2) a formal set of 
characteristics common to all models (the principle of analogical conformity); 
and (3) a set of introspective principles arising from the cybernetic limitations 
of a finite cognitive agent. The cognitive controls derive from the necessity of 
resolving a fundamental dichotomy between decidability and freedom. 


OPTIMAL DIVIDEND POLICY 

Politique de la Distribution Optimale du Dividende 

Robert Wilson 
United States of America 


1. INTRODUCTION 

The purpose of this article is to examine the ramifications of statistical decision 
theory in a market context for certain group decision problems encountered by a 
firm in choosing its financial policies. The discussion draws on recent work 
in finance and the economic theory of capital markets as well as analytical 
results in two recent articles of mine [19], [20]. 

Dividend policy is used throughout as a test case in which to reconcile the 
prevalently opposing prescriptions of statistical decision and economic theory. 
In the process we examine the special role of corporate securities as insurance 
instruments and then develop the “clientele theory” of dividend policy via a 
statistical decision-theory approach that is compatible with the Miller-Modi- 
gliani theory of dividend policy derived from the economic theory of capital 
markets [12], [13]. 


This study was supported in part by funds made available by the Ford Foundation 
to the Graduate School of Business and in part by the Atomic Energy Commission via 
a grant to the Operations Research Program, Stanford University. The conclusions in 
this publication, however, are those of the author and are not necessarily those of the 
Ford Foundation or the AEC. 




2. INDIVIDUAL BEHAVIOR IN A MARKET CONTEXT 

The essential features of the methodological discrepancy between the two
approaches can be illustrated by considering the behavior of an individual in a
perfect risk market, as studied by Arrow [1]. In such a market there exist perfect
instruments; namely, each security provides a claim to a unit of income in
one and only one "state of the world," and together they cover all states of the
world.

As a simple example, consider a perfect risk market in which a number of
individuals are indexed by i = 1, ..., I. Let U_i(·) be the ith individual's
utility function for income, and suppose that he assesses a probability distribution
π_i(θ) over the states of the world {θ}. He is considering a decision α that
would result in a payoff p_i(α | θ) contingent on the state θ. Now, if for each state
θ he can buy or issue securities in amounts h_i(θ) at the market price π(θ), his
decision problem can be posed as one of maximizing

\sum_\theta \pi_i(\theta)\, U_i\!\left[ p_i(\alpha \mid \theta) + h_i(\theta) - \sum_{\theta'} h_i(\theta')\, \pi(\theta') \right]   (1)

by choosing the decision α and the amounts of the securities h_i(θ). Observe
that the securities play the conventional role of insurance. In the market as a
whole the prices π(θ) are determined to balance the claims in each state,
namely,

\sum_i h_i(\theta) = 0 \quad \text{or} \quad \sum_i h_i(\theta) = h_0(\theta)

if there is a net gain h_0(θ) from production in state θ.

In the simplest case the decision α is determined by the differential condition

\sum_\theta \pi_i(\theta)\, U_i'(\cdots)\, \frac{\partial p_i(\alpha \mid \theta)}{\partial \alpha} = 0,   (2)

reflecting the methodology of statistical decision theory. Nevertheless, simultaneous
optimization of the security amounts yields the auxiliary condition

\pi_i(\theta)\, U_i'(\cdots) = \pi(\theta) \sum_{\theta'} \pi_i(\theta')\, U_i'(\cdots)   (3)

for each state θ. Hence the differential condition (2) for the optimal decision
α reduces simply to

\sum_\theta \pi(\theta)\, \frac{\partial p_i(\alpha \mid \theta)}{\partial \alpha} = 0.   (4)

In this form of the optimizing condition we have the epitome of the approach
through economic theory. Normally, a condition such as (4) is derived in
economic theory on the grounds that any other condition would permit riskless
arbitrage in the market; more is said about this later.

For our purposes the essence of the reduced condition (4) is that the individual's
decision α is determined by objective market opportunities embodied
in the price system {π(θ)} and is independent of his subjective preferences





embodied in the utility function and the probability assessment; indeed, at best
we could say only that the individual acts as though he had a linear utility and a
probability assessment identical with the price system.†
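A numerical sketch of conditions (1)-(4) may make the point concrete. The payoffs, prices, and exponential utilities below are hypothetical (none are from the paper); the decision and the insurance holding are jointly optimized by brute-force grid search, and the chosen decision lands at the maximizer of market value Σ_θ π(θ) p(α|θ) regardless of the subjective probabilities or risk aversion used:

```python
# Illustrative sketch (hypothetical payoffs, prices, and utilities; not from
# the paper): jointly optimizing objective (1) over the decision alpha and the
# insurance holding h(0) by grid search. Since the prices sum to 1, h(1) can
# be normalized to 0 without loss.
import math

prices = {0: 0.6, 1: 0.4}                                        # pi(theta)
payoff = lambda a, th: (4*a - a*a) if th == 0 else (2*a - a*a)   # p(alpha|theta)

def best_decision(probs, rho=1.0):
    def eu(a, h0):
        cost = h0 * prices[0]            # sum_theta' h(theta') * pi(theta')
        inc = {0: payoff(a, 0) + h0 - cost, 1: payoff(a, 1) - cost}
        return sum(q * -math.exp(-rho * inc[t]) for t, q in probs.items())
    best = max(((k / 100.0, j / 25.0 - 10.0)          # alpha in [0,3], h0 in [-10,10]
                for k in range(0, 301) for j in range(0, 501)),
               key=lambda ah: eu(*ah))
    return best[0]

a1 = best_decision({0: 0.3, 1: 0.7})             # one subjective assessment
a2 = best_decision({0: 0.5, 1: 0.5}, rho=2.0)    # another, more risk averse
# Both equal (to grid accuracy) the market-value maximizer alpha = 1.6,
# the solution of 0.6*(4 - 2a) + 0.4*(2 - 2a) = 0; only the insurance differs.
```

The two calls differ in probabilities and risk aversion, yet select the same decision: the subjective inputs show up only in the holding h(0), exactly as the reduced condition (4) asserts.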

The possibility of insurance through perfect instruments plays a crucial role 
in this analysis, and it is only through his insurance strategy that the individual 
evidences his subjective preferences. In practice, perfect instruments are 
rarely used to cover all contingencies; consequently, in the next section we 
investigate the role of corporate securities and the possibility that they may be 
surrogates for perfect instruments. 

To indicate the direction that the discussion follows, it is worth considering 
the implications of the reduced condition (4) for the behavior of a firm. Although 
there is only scanty evidence that a firm might possess preferences like an 
individual, it is at least clear that if a firm did have such preferences and in- 
surance opportunities were available then in any case its decision a would be 
determined in the same fashion via the reduced condition (4). Nevertheless,
it is not necessary for the firm to have such preferences, since objective market
opportunities based on riskless arbitrage exist for the firm just as well as for
any other agent in the market. Alternatively, an argument for the sufficiency
of the reduced condition (4) can be based on the characteristic financial structure 
of corporations. A financing mix of bonds and stocks results in a linear sharing
rule among the owners composed of a fixed interest payment on bond holdings
plus a proportional share of earnings; thus, if each owner i receives an income

p_i(\alpha \mid \theta) = a_i(\theta) + \beta_i\, p_0(\alpha \mid \theta),   (5)

contingent on state θ, where p_0 is the firm's earnings and β_i is owner i's
proportional share, then every owner would from his own preferences determine the
firm's decision α via the condition

\sum_\theta \pi(\theta)\, \frac{\partial p_0(\alpha \mid \theta)}{\partial \alpha} = 0,   (6)

implying unanimous consent.‡ We shall develop more fully below the rationale
for linear sharing rules.
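The unanimity claim can be checked numerically. In the sketch below the state prices, the earnings function p_0(α|θ), and the three owners' fixed claims a_i(θ) and shares β_i are all hypothetical; under the linear rule (5), every owner's market-value criterion picks the same decision, the one solving condition (6):

```python
# Sketch of the unanimity argument (hypothetical prices and earnings):
# each owner ranks decisions by sum_theta pi(theta)*[a_i(theta) + beta_i*p0(alpha|theta)],
# so every owner with beta_i > 0 picks the same alpha, whatever a_i or beta_i.
prices = {0: 0.5, 1: 0.5}                                       # pi(theta)
p0 = lambda a, th: (6*a - a*a) if th == 0 else (2*a - a*a)      # firm earnings
grid = [k / 100.0 for k in range(0, 401)]                       # candidate decisions

def preferred(a_fixed, beta):
    value = lambda a: sum(prices[t] * (a_fixed[t] + beta * p0(a, t)) for t in prices)
    return max(grid, key=value)

owners = [({0: 0.0, 1: 0.0}, 0.1),      # (fixed claims a_i(theta), share beta_i)
          ({0: 5.0, 1: -2.0}, 0.6),
          ({0: -1.0, 1: 3.0}, 0.3)]
choices = [preferred(a, b) for a, b in owners]
# All choices coincide at alpha = 2.0, the solution of condition (6):
# 0.5*(6 - 2a) + 0.5*(2 - 2a) = 0.
```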


3. THE ROLE OF CORPORATE SECURITIES 

From the last remarks, it follows that from the viewpoint of a group decision 
theory the characteristically linear sharing rules of corporations may act as 
mechanisms for inducing consensus, at least in a risk market with perfect instru- 
ments. In this section we probe more deeply into the special role of corporate 
securities (bonds and stocks) to establish that, in fact, they may suffice in place of 
perfect instruments to insure individuals in the market. 

† It is readily verified from (3) that π(θ) > 0 and Σ_θ π(θ) = 1.

‡ In practice, if earnings are not fully paid in dividends, this argument requires
earnings to be fully reflected in the market value of shares.





Consider a group of individuals who must make a decision α in common from
which they will receive a payoff p_0(α | θ) to be shared jointly among them,
contingent on the state θ. Assume that the normative criterion of Pareto optimality
is imposed to determine the form of the sharing rule. A condition for the
determination of the Pareto-optimal sharing rule has been obtained by Borch
[2], and under restrictive assumptions I have in [19] developed various
extensions; also, in [20] comparable results are obtained for a highly simplified
dynamic model. In brief, Borch's condition is that there exists a weight λ_i for
each individual i and a function μ(x, θ), where x = p_0(α | θ) is the payoff in state
θ, such that for each (x, θ) the share s_i(x, θ) to member i from x is determined
by

\lambda_i\, \pi_i(\theta)\, U_i'[s_i(x, \theta)] = \mu(x, \theta)   (7)

for each i, using the notation of Section 2 for the individuals' utilities and
probabilities. That this condition holds also when the Pareto optimum is determined
via a price system can be seen by comparing (7) with (3), from which it is evident
that each individual's weight is the reciprocal of his expected marginal utility
when a market mechanism is used and π(θ) = μ(x, θ).

Now we can ask, in this situation, in which the sharing rule can be perfectly
general except for the condition of Pareto optimality, will it in fact be necessary
to establish something like a system of perfect instruments? The results in [19]
can be interpreted as a qualified no. First, it is shown that the shares depend
on the state θ only if the individuals disagree in their probability assessments, in
which case the dependency takes the form of side bets with stakes of either
income directly or shares of the payoff. Since it is well known that whenever
two individuals disagree about a probability assessment there exists a mutually
profitable bet between them, this does not represent any significant requirement
for an insurance mechanism except as a means of resolving disagreements
among judgmental probability assessments (which is what makes a market).

Second, it is shown that, conditional on the state obtaining, the marginal
share ∂s_i(x, θ)/∂x of the payoff for each individual is in inverse proportion to
his risk aversion, as defined by Pratt [15]. In general this result can imply that
nonlinear sharing rules are Pareto optimal; nevertheless, there are two further
results that temper this conclusion. One is that for a large class of utility functions
(including quadratic, exponential, logarithmic, and monomial) linear
sharing rules obtain, from which it is reasonable to suppose that in most cases the
Pareto-optimal shares are approximately linear, or at least that linear shares are
approximately optimal. The other is that in order for the group's behavior to be
consistent with the Savage axioms for consistent decision making under
uncertainty [16] it is necessary and sufficient that the shares be linear in the payoff.†
Although the full implications of this result remain unclear at present, it might be
argued that to the extent Pareto optimality interferes with consistency we might
scrap Pareto optimality in favor of linear shares (alternatively, we might argue
that corporations should provide more securities that share nonlinearly; for
example, convertible bonds, warrants, participating preferred).

† It is Savage's fourth postulate that is in jeopardy, which corresponds to Luce and
Raiffa's substitutability axiom [11].
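For the exponential class the linearity result can be exhibited in closed form. The sketch below (with assumed Pareto weights and risk aversions, and a common probability assessment so that the π_i(θ) cancel) solves Borch's condition (7) for the shares and confirms that each marginal share equals that member's risk tolerance 1/ρ_i as a fraction of the group total:

```python
# Sketch (assumed weights and risk aversions, common probability assessment):
# solving Borch's condition (7), lambda_i * U_i'(s_i) = mu(x), for exponential
# utilities U_i(s) = -exp(-rho_i * s) subject to sum_i s_i(x) = x yields shares
# that are LINEAR in x, each with slope (1/rho_i) / sum_j (1/rho_j).
import math

rho = [0.5, 1.0, 2.0]            # risk aversions (hypothetical)
lam = [1.0, 1.3, 0.7]            # Pareto weights (hypothetical)
tau = [1.0 / r for r in rho]     # risk tolerances 1/rho_i
T = sum(tau)

def shares(x):
    # From (7): s_i = (ln(lam_i*rho_i) - ln mu)/rho_i; summing to x fixes ln mu.
    ln_mu = (sum(t * math.log(l * r) for t, l, r in zip(tau, lam, rho)) - x) / T
    return [(math.log(l * r) - ln_mu) / r for l, r in zip(lam, rho)]

s1, s2 = shares(10.0), shares(20.0)
slopes = [(b - a) / 10.0 for a, b in zip(s1, s2)]
# Each slope equals tau_i / T: marginal shares in inverse proportion to
# risk aversion, and the weights shift only the fixed parts of the shares.
```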

In summary of the above points we can conclude that a system of perfect
instruments is largely unnecessary. Pareto-optimal allocation under uncertainty
of the returns from a productive enterprise can be achieved via shares in the
payoff that are approximately linear, provided that opportunities exist for side
bets to resolve judgmental disagreements. A formal insurance mechanism,
therefore, can be confined to contingencies affecting individuals independently
of productive returns (such as death, casualty, or hazard). Moreover, the
determination of an individual's sharing proportion is in effect largely a matter
of his own initiative in matching it to his risk aversion.

The thrust of these conclusions is that, from the point of view of individuals
in the market, the usual forms of corporate securities (bonds and stock) can
play a major role as a means of providing contingent claims. The financial 
policies of a firm are therefore of interest to an individual in the market because 
they affect his means of insurance. 

The foregoing reflects the statistical decision-theory approach. In contrast,
economic theorists have reached an asymmetrical conclusion; namely, that the
subjective preferences of a firm's shareholders are irrelevant in the sense that
financial policies such as leverage and dividend policy do not affect the valuation
of the firm. Our next task, therefore, is to examine this theory and to demon- 
strate how it can be integrated with the conclusion of statistical decision theory. 
The results, we shall see, indicate in what sense there can be a theory of group 
decision making in a market context. 


4. THE MILLER-MODIGLIANI THEORY OF DIVIDEND POLICY

Let us, then, consider a market for corporate securities in which are available 
from each firm bonds yielding a stated prescribed return in each state and stocks 
yielding proportional shares of declared dividends in each state. Each individual 
in the market may be presumed to have a subjective probability assessment over
the possible states of the world and a utility function for streams of income.

Now the fact that the preferences of individuals relate to streams of income
produces certain points of contention, the elimination of which requires us further to
distinguish between two cases of market structure. In the first, which we shall
not be pursuing further, individuals may not borrow at the market prices; that is, 
they cannot issue bonds at the market rates for corporations, even though the 
patterns of returns over the states are identical. In this case, even given the
market price system for corporate securities, an individual's preference relations
among alternative portfolios are determined only conditionally on the
financial policies of firms, including leverage, dividend, and investment policies.
This is clear from the fact that the time stream of income from a security, 
especially the stock of a corporation, depends on the firm’s financial policies. 
Although this case is not studied here, it is nevertheless of considerable interest 




because of the discrepancies in practice between corporate and individual 
borrowing rates. 

In the second case, in which individuals may issue securities at the market 
prices for identical state patterns of returns, the role of individuals’ preferences 
for time streams of income is eliminated. This is because, given an individual's 
wealth as measured by the price system, he can buy or issue securities to dis- 
tribute his wealth over times and states to achieve a preferred pattern of income. 
This feature has been studied in the special case of complete certainty by Fisher 
[5] and provides the basis for his theory of interest. It has also been extended 
to the case of uncertainty by Hirshleifer [7], [8] who used Arrow’s theory of 
risk markets [1]. The result, of course, is for each individual an induced utility 
function for present wealth and a set of induced preference relations among 
portfolios, based on the market valuation of securities. The discussion to 
follow is confined to this second case of market structure. 

With this type of market structure, an individual’s preference relations 
among alternative portfolios of securities depend only on the price system and 
not on the financial policies of firms directly. For any given price system, 
therefore, there is a resulting demand for each security of each firm, and in equi- 
librium the price system is determined to equalize the demand and supply of
each security. 

Because we are interested here in the financial policies of firms and, in
particular, in the problem of dividend policy, the pertinent question at this 
point is whether the financial policies of a firm significantly affect its valuation 
in the market by the price system, and it is this question to which we now turn. 

On the subject of dividend policy, the major work in the economic theory of
capital markets is the pair of seminal papers by Modigliani and Miller [13],
[12]. In these papers they establish two basic propositions about a perfect
capital market†:

1. The market valuation of the firm in total is not affected by its mix of 
financial instruments (stocks, bonds, etc.). 

2. Given its investment policy, the valuation of the firm is not affected by 
its dividend policy. 

The first proposition follows essentially from the observation that a com- 
modity cannot sell for more than one price in a market; in this case the argument 
is that, since an investor can hold the firm’s instruments in the same proportion 
as the financing mix, it follows that the valuation of the income stream of the 
firm must be equal to the sum of the valuations of the income streams allocated 
to the instruments, regardless of the mix. 
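The value-additivity argument behind the first proposition is easy to verify numerically under assumed state prices. In the sketch below (all numbers hypothetical), earnings are split between a bond paying min(F, earnings) and a stock receiving the residual; the sum of the two valuations equals the firm value for every face amount F:

```python
# Numerical sketch of Proposition 1 (hypothetical state prices and earnings):
# splitting the earnings stream between bonds and stock in any proportion
# leaves the sum of the two market valuations equal to the firm's total value.
prices = {0: 0.2, 1: 0.5, 2: 0.3}          # state prices pi(theta) (assumed)
earnings = {0: 40.0, 1: 100.0, 2: 160.0}   # firm earnings p0(theta) (assumed)
value = lambda payout: sum(prices[t] * payout[t] for t in prices)

firm_value = value(earnings)
for face in (0.0, 30.0, 80.0, 200.0):      # alternative debt levels in the mix
    bond = {t: min(face, earnings[t]) for t in prices}   # bondholders' claim
    stock = {t: earnings[t] - bond[t] for t in prices}   # residual to stock
    assert abs(value(bond) + value(stock) - firm_value) < 1e-9
```

The assertion holds because bond and stock payoffs sum state by state to the earnings, and valuation by state prices is linear; this is the one-price argument in miniature.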

The second proposition comes in two parts. First, it is irrelevant whether 
uninvested funds are used for dividends or repurchases of shares, since the 
investor can always buy or sell shares to obtain the actual cash income he desires; 

† A perfect capital market is the second case of market structure described above,
augmented by assumptions eliminating such market imperfections as transfer costs and
differential tax effects, plus an assumption called "symmetric market rationality"; none
of these assumptions is reviewed here.





that is, share repurchases increase share prices by the amount of the dividend 
that otherwise could have been paid and wealth is unaffected, barring differential 
tax treatments of dividends and capital gains. The second part of the argument
is that using dividends to change the financing mix (e.g., by borrowing to pay
dividends) is irrelevant because of the first proposition.

It is imperative to observe that the validity of these propositions depends only
on the presumption that each investor prefers more wealth to less; hence (as is
always the equilibrating force in a market) he will take advantage of any oppor- 
tunities for arbitrage, should the propositions fail to hold. Moreover, such 
opportunities are objective in the sense that they depend only on market valu- 
ations embodied in the price system and depend not at all on subjective prefer- 
ences compounded from the artifacts of statistical decision theory. 

In view of the evident simplicity, transparency, and force of Miller and
Modigliani's propositions, it is surprising that they have stirred such commotion
in the theory of finance (e.g., see the collection of papers in the Journal
of Finance, May 1963). Some, [4], [10], [17], and [18], have raised the problems
of market imperfections to question the valuation theory; Gordon [6] and Walter
([18], p. 285) have insisted on tying dividend policy to investment policy. In
most of these discussions, however, there is a pervasive hint that the authors just
cannot believe that the policies of the firm should not in some sense reflect the
preferences of the owners, that is, that the decision-theory approach is vacuous.
It is thought that the owners of a firm dictate its preferences; but, in fact, in a
market ownership is irrelevant as a decision criterion, for it is solely a means of
distributing wealth, for which the packaging is irrelevant, and wealth is determined
solely by productive opportunities. Hirshleifer [7], [8] has emphasized
this last point in his extension of Fisher's analysis.

From these remarks it is abundantly clear that I agree with the Miller-Modigliani
propositions, so it may come as a surprise that I now want to go on
to obtain normative prescriptions for the optimal dividend policy of a firm in a
market. The route of others has been to suppose the existence of strong effects
from market imperfections (differential tax rates, transaction costs), for example,
Lintner [10].† Results in this direction, of course, depend on questions of fact
regarding the magnitude of such effects, for which there is only inconclusive
evidence. We shall take a different tack by considering the "clientele theory"
mentioned as a possibility by Miller and Modigliani ([12], p. 431).‡


† There is also the school of thought led by Donaldson [3], who has criticized
managerial practices focused on customary "debt limits" and opportunistic uses of debt
financing to acquire funds in times of heavy financing, as revealed in extensive case
studies. He argues that there is an appropriate "debt capacity" determined by the
probability of cash shortages and the risk preferences of the firm, in view of bankruptcy
penalties or punitive financing charges should the firm run short of cash. If the notion
of a firm's risk preferences were appropriately delineated, this approach would be
compatible with the theory to be developed in Section 5.

‡ Hunt [9] has developed a sort of clientele theory based on an extrapolation from
closely held firms, but it depends on being able to maximize equity values, contrary to
the results of Modigliani and Miller about market effects. Also, he uses Donaldson's
debt theory [3] as well as Solomon's [17].





5. THE CLIENTELE THEORY OF DIVIDEND POLICY 

Since the valuation of a firm in a perfect market is not affected by its leverage
and dividend policies, it must be that optimal policies, if any, are determined by
other considerations. Here we pursue the ramifications of the results mustered
in Sections 2 and 3 to show that individuals in a market rely on corporate
securities as a means of implementing their insurance strategies (in the sense
introduced there) and of guaranteeing preferred time-state patterns of income.

Now in an absolutely perfect market there are no effective ramifications of
these results, for each individual can, with no loss at all, buy, sell, and issue
securities to implement his insurance strategy. If, however, there are expenses
involved in the insurance process, no matter what their magnitude, we are
inevitably led to a clientele theory of corporate leverage and dividend policies.
Included in such expenses might be the costs of information, transactions,
and/or differential tax treatments between dividends and capital gains. If
these costs are comparatively small for active traders (i.e., arbitrageurs) in the
market, they will not affect the valuation of a firm; yet among the great majority
of investors they will discourage transactions and induce a reliance on the firm's
financial policies to implement their insurance strategies. Given any particular
set of announced or evident financial policies, therefore, a firm will attract a
set of investors for whom these policies, together with the policies of other
firms in their portfolios, implement their insurance strategies.† To whatever
extent the management of a firm is responsible to, or sensitive to the interests
of, the owners (especially the shareholders, since they hold the vote and the
returns of bondholders are prescribed), there will exist optimal financial
policies.‡

It might be thought that any set of financial policies is optimal as long as it
is perpetuated to sustain the insurance strategies of the owners without
requiring excessive transactions. That this is not so, that in fact only certain
kinds of financial policies can be optimal, is the thesis of the results to follow.
Subsequent discussion relates the analytical results developed in [20] on the
form of a Pareto-optimal dividend policy in a highly simplified dynamic model
to the problem of a clientele theory of optimal leverage and dividend policies.
It is taken for granted that one condition for optimality is stability of the policies
over time to minimize transactions; hence it is presumed that the ownership
group is fixed over time.

As in Section 3, consider a group of individuals (indexed by i = 1, ..., I)
who make their decisions in common and share the proceeds jointly. At each of
several times (indexed by t = 1, ..., T) the group has an amount of capital

† This proposition is testable in the sense that it is empirically refutable because it
implies that a change in the firm's financial policies, or in its productive opportunities,
will generate a larger turnover in its ownership group. This may explain why large
issues of new securities tend to sell at a discount initially.

‡ A clientele theory also results from the first case of market structure, defined in
Section 4, or, in the presence of bankruptcy penalties or punitive financing charges in
times of cash shortages, from Donaldson's debt theory [3].




c_t to allocate between an immediate payout s_t^0 and reinvestment c_t − s_t^0
for one period, the latter at a known rate of return r_t. Following a decision at
time t, the capital at time t + 1 is†

\tilde c_{t+1} = (1 + r_t)(c_t - s_t^0) + \tilde x_t,   (8)

where x̃_t is a random variable for which each member has assessed a probability
distribution that may depend on history. Let s_t^i be individual i's share
of the payout s_t^0 and assume that he assigns a utility function U_i(s_1^i, ..., s_T^i)
for his proceeds of the special form

U_i(s_1^i, \ldots, s_T^i) = \sum_{t=1}^{T} (\alpha_i)^{t-1}\, U_t^i(s_t^i), \qquad \alpha_i > 0,   (9)

where

U_t^i(s_t^i) = -e^{-\rho_t^i s_t^i}, \qquad \rho_t^i > 0.   (10)

Specifically, each utility function is separable, discounted, and exponential.
The discount rate α_i can be taken as a measure of time preference for income.
Pratt [15] has called ρ_t^i the measure of "risk aversion," which is taken here
to be constant with respect to s_t^i mainly to permit explicit analytical
derivations.

Let it be added parenthetically that this model is only illustrative, and no
claim is made for generality; in particular, the investment structure has been
drastically simplified. Phelps [14] has developed a similar model for a "one-person
group" with a richer investment structure‡; see also Yaari [21].

The goal of the group, we shall suppose, is to select a Pareto-optimal payout
policy.§ With this normative criterion, what can be said about the form of an
optimal policy? The following results, among others, have been obtained in
[20].

First, the payout s_t^0 at each time t is a linear function of the present disposable
capital c_t. In particular, the payout is the sum of a proportion of the capital
and a fixed payment. The latter depends on (a) the time preferences for income
of the individuals, (b) the weights accorded the individuals in determining the
Pareto optimum (i.e., the allocation of ownership), and (c) the present cash
equivalents of future uncertainties (i.e., {x̃_τ; τ = t, ..., T}). In the extreme
case in which each individual's risk aversion is constant over time (ρ_t^i = ρ^i for
all t) and there is an infinite time horizon (T = ∞) the proportion of capital
paid out is simply r_t/(1 + r_t). Thus in this case the optimal payout policy is
composed of a fixed payment, which can be prescribed in advance contingent
on the state of the world, and a variable proportion of capital which is the
current earnings rate on investment (after being discounted for the in-process

† A tilde (~) denotes a random variable.

‡ Unfortunately, there are errors in his results; e.g., the derived policy is correct only
for γ = 0 in his notation.

§ There is no unique policy but rather a family of such policies.




duration of one period). The parallel between this policy and the characteristic
payout policies of corporations is at least remarkable.†

The form of a Pareto-optimal rule by which the individuals share the payout
is also interesting. As with the aggregate, each individual shares linearly, receiving
a fixed payment plus a proportion of the disposable capital. In particular,
his proportion of the disposable capital is in inverse proportion to his risk
aversion.
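In the stationary special case described above, the variable part of the policy can be checked directly against the dynamics (8): paying out the proportion r_t/(1 + r_t) of capital each period leaves capital unchanged in the absence of shocks and fixed payments. A minimal sketch with assumed numbers:

```python
# Sketch (assumed return and initial capital): with payout proportion r/(1+r)
# and no fixed payment or shocks, dynamics (8) return capital to its starting
# level every period, since (1+r)*(c - c*r/(1+r)) = c*(1+r) - c*r = c.
r = 0.08          # known one-period return r_t (assumed constant)
c = 100.0         # initial disposable capital (assumed)
for _ in range(50):
    payout = (r / (1 + r)) * c          # variable part of the optimal payout
    c = (1 + r) * (c - payout)          # equation (8) with x_t = 0
# c is (up to rounding) still 100.0: the policy pays out exactly the
# one-period earnings, discounted for the in-process duration.
```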

If these results are in any way indicative of what might be demonstrable in a 
more general development of the clientele theory, we can conclude the following.
First, even the weakest of normative criteria, Pareto optimality, leads to pre- 
scribed forms for the payout policy and, to a degree, to quantitative specifications. 
In simple models we can obtain linear payout policies, the fixed and variable 
parts of which can be identified with interest and dividend payments and there- 
fore determine optimal leverage and dividend policies in the context of a clientele 
theory. In more complicated models we would be led to nonlinear policies and 
therefore to investigations of the potential roles of more complicated financial 
instruments. 

Of course, the results presented here abstract away much of reality, and even 
so the hard analytical results are comparatively meager. Nevertheless, they 
demonstrate that there can be a role for a group decision theory in a market 
context and that there are many possibilities for further research — the clientele 
theory of financial policy being among the most important. 


6. REFERENCES 

[1] Kenneth J. Arrow, "The Role of Securities in the Optimal Allocation of Risk
Bearing," Rev. Econ. Studies, 31, 91-96 (1963-1964).

[2] Karl Borch, "Equilibrium in a Reinsurance Market," Econometrica, 30, No. 3,
424-444 (July 1962).

[3] Gordon Donaldson, Corporate Debt Capacity, Division of Research, Harvard
Business School, Boston, Massachusetts, 1961.

[4] David Durand, "The Cost of Capital in an Imperfect Market: A Reply to Modigliani
and Miller," Am. Econ. Rev. (June 1959).

[5] Irving Fisher, The Theory of Interest, Macmillan, New York, 1930.

[6] Myron J. Gordon, "Optimal Investment and Financing Policy," J. Finance, 18,
No. 2, 264-272 (May 1963).

[7] Jack Hirshleifer, "Investment Decision under Uncertainty: Choice-Theoretic
Approaches," Quart. J. Econ., 79, No. 4, 509-536 (November 1965).

[8] Jack Hirshleifer, "Investment Decision under Uncertainty: Applications of the
State Preference Approach," Quart. J. Econ., 80, No. 2, 252-277 (May 1966).

[9] Pearson Hunt, Financial Analysis in Capital Budgeting, Director of Case Distribution,
Harvard Business School, Boston, Massachusetts, 1964.

[10] John Lintner, "Dividends, Earnings, Leverage, Stock Prices, and the Supply
of Capital to Corporations," Rev. Econ. Stat., 44, 243-269 (August 1962). See
also "The Cost of Capital and Optimal Financing of Corporate Growth," J.
Finance, 18, No. 2, 292-310 (May 1963).

[11] Duncan Luce and Howard Raiffa, Games and Decisions, Wiley, New York, 1957.

† Note that if the variable payout is identified with dividends, this policy implies higher
average dividend yields (including share repurchases) among levered firms.



SESSION III 


ADVANCES IN TECHNIQUES OF MODELING 

Progrès dans les Techniques de Modèle
Chairman: G. E. Kimball (United States of America) 




QUELQUES ASPECTS THEORIQUES ET PRATIQUES 
DES PROCESSUS DE COMPORTEMENT ADAPTATIF 


Some Theoretical and Practical Aspects 
of the Adaptive Control Process 

J. Torrens-Ibern 

Escuela Técnica Superior de Ingenieros Industriales
de Barcelona and Ingenieros Consultores, S.A., Spain


1. INTRODUCTION

The domain covered by adaptive control processes is truly very wide, so
much so that it is difficult to mark its limits.

Setting aside, however, all generalizations relating to cases that do not
involve rational conduct, as well as the psychological aspects tied to learning
theory in its broadest sense, we shall limit ourselves to the mathematical and
economic aspects of adaptive control processes when the model is stochastic
and the probability law governing the phenomenon under study is not known
at the outset.

A first stage will therefore be characterized by the gathering of the
information on which future behavior can be rationally based.

In certain classical problems it is assumed that nothing is known of the
probability laws of transition from the state chosen by the decision taken to the
state that chance will produce. These conditions occur only very rarely in
reality, where complete ignorance of the unfolding of the phenomenon is
hardly conceivable.

To this prior knowledge is added the information due to the experiments
carried out, bearing on the results of the random phase of the process. The
analysis of this information leads to the construction of the decision models
that will serve to automate the decisions, while the accumulation of information
constitutes what may be called, from the psychological point of view, the
experience of learning.

The second stage concerns the application of the decision model, that is,
the setting up of concrete decision rules as a function of the state observed at
the instant at which the decision must be taken. This decision may be final, or
it may admit corrections in the light of additional information received after
execution has begun.






This second stage is not separate from the first, because during its
realization the gathering of information continues, which can improve and
refine the decision model.

Finally, it is important to take account of the monitoring of the decision
models, and this with a twofold objective: to check the stability of the
probability laws adopted following the statistical analyses undertaken, and
to verify that the structure of the model does not need to be modified.

2. DISCRETE ADAPTIVE CONTROL PROCESSES

The theory of discrete adaptive control processes is one of the developments
of dynamic programming that seems destined for the most brilliant
future.

The applications made by Bellman and Dreyfus of dynamic programming
to the solution of the problem of the optimal trajectory of a projectile toward a
moving target are well known [1]. Howard, for his part, has studied random
phenomena in which the transition probabilities from a given current state to
the possible subsequent states are known and independent of time, that is,
the case of Markov processes [2].

However, in the domain of adaptive control processes there are other
cases in which this information is lacking; one knows that one is facing a
random phenomenon, but the probabilities of transition from the current state
to each of the possible following states are not known. One must therefore
collect the information that will provide knowledge of these probability laws,
so as to use it at once in order to take the best possible decisions without
waiting for perfect knowledge of the randomness of the phenomenon.

Consider a stochastic process characterized by the existence of several possible states X_1, ..., X_K, and let x_i denote the one produced immediately before instant i; let Y_1, Y_2, ..., Y_L be the decisions that can be taken, of which y_i is the one chosen at instant i. Then, following the state x_i and the decision y_i, chance produces a transition, a random function of x_i and y_i, through which the process is supposed to reach the initial state of instant i + 1:

x_{i+1} = φ(x_i, y_i, z_i)

These conditions have an economic consequence, a function of the three variables and of additive character, which is to be optimized over the period between instant 1 and the final instant N:

R_1(X, Y, Z) = f_1(x_1, y_1, z_1) + ··· + f_i(x_i, y_i, z_i) + ··· + f_N(x_N, y_N, z_N)

For this, following Bellman, if we call F_i(X, Y, Z) the sum of the expected values obtained by following the optimal policy from instant i to N, and F_{i+1}(X, Y, Z) the same sum from instant i + 1 to N, we shall have:

F_i(X, Y, Z) = opt_y [ E_z[f_i(x_i, y_i, z_i)] + F_{i+1}(X, Y, Z) ]



ASPECTS DES PROCESSUS DE COMPORTEMENT ADAPTATIF


The iterative calculation, step by step, of this expression for i = N, N − 1, ..., 2, 1 permits the determination of the optimal decisions over the whole period in question. The difficulty encountered in its solution comes from the fact that in the expression

E_z[f_i(x_i, y_i, z)] = Σ_j p_ij f_i(x_i, y_i, z_j)

the probabilities p_ij = Pr(z = z_j | x_i, y_i) are not known and must be estimated by means of the results obtained in the preceding phases.
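The backward recursion just described can be sketched numerically. The following is a minimal illustration, not the author's program: the function name, the state/decision encoding, and the use of observed frequencies as the estimated transition probabilities p_ij are assumptions made for the example.

```python
import numpy as np

def backward_induction(reward, p_est, horizon):
    """Bellman recursion F_i = opt_y [ E_z f_i + F_{i+1} ] over a finite horizon.

    reward[x, y]   : stand-in for E_z[f_i(x_i, y_i, z_i)]
    p_est[x, y, x2]: estimated transition probabilities (e.g. frequencies
                     observed in the preceding phases of the process)
    """
    n_states, n_dec = reward.shape
    value = np.zeros(n_states)                 # F_{N+1} = 0
    policy = np.empty((horizon, n_states), dtype=int)
    for i in reversed(range(horizon)):         # i = N, N-1, ..., 1
        q = reward + p_est @ value             # expected reward-to-go
        policy[i] = q.argmax(axis=1)
        value = q.max(axis=1)
    return value, policy

# Two states, two decisions, transitions that keep the current state:
reward = np.array([[1.0, 0.0], [0.0, 2.0]])
p_est = np.zeros((2, 2, 2))
p_est[0, :, 0] = 1.0
p_est[1, :, 1] = 1.0
value, policy = backward_induction(reward, p_est, horizon=2)
print(value)        # [2. 4.]
```

With transitions that never leave the current state, each state simply collects its best one-step reward at every instant, which makes the result easy to check by hand.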

2.1. Bayesian Estimation of the Probabilities

It appears reasonable to estimate the transition probabilities p_ij from state x_i, after the decision y_i, to state X_j by means of Bayes' theorem and, more concretely, by means of the Laplace-Bayes formula.

The solution of this problem rests on the following reasoning.

Consider m + 1 urns, each containing m balls, of which respectively 0, 1, 2, ..., m are white. An urn is chosen at random; n balls are drawn from it non-exhaustively (with replacement), and the composition of the chosen urn is to be estimated from the number r of white balls found among the n drawn. If the number of urns increases indefinitely and we call w the proportion of white balls in the chosen urn, the elementary a priori probability of this choice is dP(w). The probability of the sample drawn, under the hypothesis of a proportion w, is:

C_n^r w^r (1 − w)^{n−r}

Bayes' theorem gives as the a posteriori probability of the choice of the urn of proportion w, after simplification:

dP_w = w^r (1 − w)^{n−r} dP(w) / ∫_0^1 w^r (1 − w)^{n−r} dP(w)

The estimation of the proportion w according to classical theory consists in determining its mathematical expectation:

E[w] = ∫_0^1 w^{r+1} (1 − w)^{n−r} dP(w) / ∫_0^1 w^r (1 − w)^{n−r} dP(w)

In the case in which the a priori probability is uniformly distributed, dP(w) = dw and

w* = E[w] = (r + 1) / (n + 2)

It can be shown that this estimate is unbiased and that it has maximal efficiency and minimal variance [3]. It is therefore evident that this estimate is, on average, the best.





In the present case, however, no repetition of the trials under the same initial conditions need be taken into account, and the condition of being the best estimate on average has little meaning. Indeed, the decision to be taken at each stage must be optimal given the results already acquired; it seems preferable to choose the estimate to which the greatest probability is attached, that is, to choose the mode of the probability density function instead of the mean. It may therefore be defined as the maximum-likelihood estimate.

If we consider, as before, that the a priori probability is uniform, we write:

f(w) = dP_w/dw = w^r (1 − w)^{n−r} / ∫_0^1 w^r (1 − w)^{n−r} dw


Since the denominator has a constant value that does not depend on the proportion w chosen, the maximum of the probability density coincides with the maximum of the numerator. It suffices, then, to set the derivative of the latter with respect to w equal to zero:

r w^{r−1} (1 − w)^{n−r} − (n − r)(1 − w)^{n−r−1} w^r = 0


and to take as the estimate of w the root of this equation:

ŵ = r / n


The two preceding estimates w* and ŵ tend toward each other for very large values of n and r. However, considering that the decision to be taken at each instant of the process is a unique decision, it seems preferable to choose the estimate whose probability density is maximal.
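The two estimates can be compared numerically. A small sketch (the function names are ours, not the paper's):

```python
from fractions import Fraction

def posterior_mean(r, n):
    """Laplace-Bayes estimate under a uniform prior: w* = (r + 1) / (n + 2)."""
    return Fraction(r + 1, n + 2)

def posterior_mode(r, n):
    """Maximum of the posterior density: w-hat = r / n."""
    return Fraction(r, n)

# For small samples the two differ noticeably...
print(posterior_mean(3, 4), posterior_mode(3, 4))    # 2/3 3/4
# ...but they approach each other as n and r grow:
print(float(posterior_mean(300, 400)), float(posterior_mode(300, 400)))
```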

3. CONTINUOUS ADAPTIVE-BEHAVIOR PROCESSES

In this case the stochastic process leads to a state characterized by a continuous variable x. At an instant i the measured value of this variable is x_i; faced with this state, a decision is taken that we shall denote y_i. The random evolution of the process brings about a consequence that we shall call z_i; we shall also say that this outcome is a random function of the state x_i and of the decision y_i. In reality, however, what the pair (x_i, y_i) can give is the probability density function of z:

p(z, x_i, y_i) dz

and at random there will result a consequence z, which will lead to the state x_{i+1}; the economic outcome will likewise be a function of these variables.

Let us consider a concrete case: let x_i be the monthly sales volume of a firm and y_i the sum devoted to advertising for month i by the same firm; z_i is the impact of this advertising on the sales volume of the same month, which will be measured by the sales volume of the following month, that is, by x_{i+1}.

This sales volume x_{i+1} is expressed as the result of a function of the preceding state x_i, of the decision y_i, of the unknown random impact z_i, and of the "state of nature" θ_i:

x_{i+1} = φ(x_i, y_i, z_i, θ_i)

Of these variables, we shall evidently know x_i; θ_i will not be known, but it may be admitted that it can be estimated by means of certain indices or indicators which are themselves known. In the case we are considering, these indices might be the index of industrial activity, the index of monetary circulation, the price index, etc.

It is thus a matter of choosing y_i according to some criterion, tied to the unknown probability law of z_i and to the economic function of the outcomes. Whatever this criterion may be, a certain specification of the probability law of z_i must be assumed, together with a certain form of connection between it and the state and decision variables. For example, one may initially assume a normal probability law and a connection with the other variables following a linear regression as concerns the mean values of z_i, the variance being constant.

On the other hand, for the decision to be taken, the economic criterion envisaged must be taken into account. Indeed, it is possible to take as the best choice the one that yields the highest gain on average, whether on a classical economic balance or on one in which holding and shortage costs would intervene. It is also possible to take as the best choice the one that has the greatest probability of yielding a high gain. One can, finally, take as criterion that of having a risk smaller than α of a gain below a chosen value.

3.1. Study of the Probability Law

Part of the problem concerns the analysis and specification of the probability law of z_i. In the most general case it will be a multidimensional probability law, expressed as:

f(z, x, y, θ) dz dx dy dθ

A rather plausible simplifying hypothesis is that the law of z_i is normal and that the other variables are not random. Among these variables, included in the "state of nature," time might appear; it is also possible that the variable z is independent of the decision taken. Such is the case we have studied elsewhere in solving the problem of "Production control for a random seasonal demand" [4].

There, the aim is to forecast the demand in order to carry out a production appropriate to that demand. The data are the values of the demand for




In the first case, the mathematical expectation of z must be determined, which is expressed:

E[z] = ∫_{−∞}^{+∞} z f(z, x, y, θ) dz

In the most general case, the random outcome variable z is considered to be linked by a multiple linear regression to the factors intervening in the phenomenon:

E[z_i] = a_0 + a_1 x + a_2 y + a_3 t + a_4 h + ···

σ²_{z·xyt...} = σ_z² (1 − R²_{z·xyt...})

It will then be a matter of choosing the decision y_i that gives the optimal result, account being taken of the particular values assumed by the other variables.
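Such a regression can be fitted by ordinary least squares. In the sketch below the data and the coefficient values are invented for the illustration (a state variable x, a decision variable y, and an outcome z with small normal noise):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)          # state variable (e.g. past sales)
y = rng.uniform(0, 5, 200)           # decision variable (e.g. advertising)
z = 2.0 + 0.5 * x + 1.5 * y + rng.normal(0, 0.1, 200)

# Fit E[z] = a0 + a1*x + a2*y by ordinary least squares.
X = np.column_stack([np.ones_like(x), x, y])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
print(coef.round(2))
```

With 200 observations and noise of standard deviation 0.1, the fitted coefficients come back very close to the generating values (2.0, 0.5, 1.5).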

3.3. Choice of the Optimal Decision

For the choice of the optimal decision, account may be taken of the expected value of z and also of the value to which the maximum likelihood is attached. In the first case one would have:

E[z] = ∫_{−∞}^{+∞} z f(z, y, x, θ) dz

and in the second it would be necessary to find the root of the equation:

∂f(z, x, y, θ)/∂z = 0

once the corresponding values have been substituted for the other variables x, y, θ. The choice between the two criteria may be based on the conditions of application of the model: if the phenomenon is repetitive, the mathematical-expectation solution is to be preferred, whereas if it is a phenomenon whose conditions of realization vary, or which is to give rise to only a single decision, the maximum-likelihood method is more appropriate.

Note that if the probability law of z is normal, as we have supposed, the two solutions coincide.

These solutions are nevertheless not acceptable in the majority of cases. If, for example, demand forecasting is in question, knowing the mean value or the most probable value does not mean that these values should be taken as the quantities to produce. From the economic point of view, the different eventualities of z should be weighted with the corresponding costs (manufacturing cost, storage cost, shortage cost, etc.) so as to obtain the mean total cost as a function of the decision y. Setting the derivative of this cost with respect to y equal to zero gives the desired solution:

∂/∂y ∫ C(y, z) f(x, y, z, θ) dz = 0

C(y, z) representing the mean costs per unit of z, which will be functions of z and y.
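A numerical sketch of this cost-balancing rule, with an invented normal demand law and invented unit costs (none of these figures come from the paper): instead of solving the derivative equation in closed form, the expected cost is evaluated on a grid and minimized directly.

```python
import numpy as np

# Discretized normal demand density (mean 100, s.d. 20) on a grid.
z = np.linspace(0.0, 200.0, 2001)
dz = z[1] - z[0]
f = np.exp(-((z - 100.0) ** 2) / (2 * 20.0 ** 2))
f /= f.sum() * dz                            # normalize on the grid

c_hold, c_short = 1.0, 3.0                   # storage vs shortage cost per unit

def expected_cost(y):
    # E[C(y, z)] = c_hold * E[(y - z)+] + c_short * E[(z - y)+]
    c = c_hold * np.maximum(y - z, 0) + c_short * np.maximum(z - y, 0)
    return (c * f).sum() * dz

ys = np.linspace(60.0, 160.0, 201)
y_star = ys[np.argmin([expected_cost(y) for y in ys])]
print(round(y_star, 1))
```

With these costs the optimum sits at the 3/4 fractile of the demand law (about 113.5 here), which matches the classical balance between storage and shortage costs.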



148 


J. TORRENS-IBERN 


The other point of view permitting the choice of a decision may be that of security. It would be the one taken into account if one wished to run only a very small risk α of a stockout following an exceptional demand:

1 − α = ∫_0^{z(y)} f(x, y, z, θ) dz

It does not give an optimizing solution, since to each possible decision there corresponds a different limit for the integral (which we have emphasized by writing the limit as a function of y).

For this criterion to be practically usable, the probability law must be normal, or it must be possible to express z(y) as a function of α.
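For a normal law, the security level is simply a quantile. A one-line sketch with invented numbers:

```python
from statistics import NormalDist

mu, sigma, alpha = 100.0, 20.0, 0.05        # demand law and stockout risk
# Choose the quantity z so that P(demand > z) = alpha:
z_alpha = NormalDist(mu, sigma).inv_cdf(1 - alpha)
print(round(z_alpha, 1))    # 132.9
```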


4. CONTROL OF THE PROBABILITY LAW OF THE PROCESS

Knowledge of the probability law of the process, however rudimentary, permits decisions to be taken with the hope that they are the best in the concrete state of the information gathered.

It is not enough, however, to accept without control that the results of the random phase agree with the model adopted. It must be verified, as the deadlines fall due, that the deviations between forecasts and realizations can be attributed to chance. The causes that can bring about a significant discordance between the model and reality are very varied, and their effects must be detected as early as possible so as not to maintain a model that proves inappropriate.

It is true that the dynamic character of the forecasts allows them to be corrected, without waiting for the moment of realization, as soon as new information arrives; but even so the model is constituted not only by the parameters of the probability law, which become more precise as time goes on, but also and principally by its structure. It may be that from the outset this supposed structure is not appropriate, and it may also be that time modifies a structure that originally agreed with the model. In the concrete case of the forecasting study of demand from aggregate data, a posteriori control is indeed found to detect:

— the changes that have occurred in the demand pattern of the clients;

— the cases in which the forecast trends do not conform to reality;

— modifications concerning the structural conditions of demand.

The principal drawback of this control comes from the fact that the classical charts employed in the surveillance of industrial manufacturing processes do not have sufficient efficiency in this case, where the observations are unique. The solution can be found in the use of control charts based on the cumulation of observations, which enjoy a much greater efficiency than the classical charts.

These charts are now well developed and can be employed with full knowledge of their properties [5]. Their application to this concrete case of unit samples nevertheless requires certain special considerations; concretely, here it is not possible to choose the conditions concerning an RQL quality with its risk β; instead, the value that determines this quality is a result that must be calculated as a function of the other conditions [6]. Account must be taken, in particular, of the costs tied to the errors of estimation of the mathematical expectation of z (values estimated by deficiency or by excess) and of the risks corresponding to the same errors in the application of the hypothesis tests.
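A one-sided cumulative-sum (CUSUM) chart of the kind referred to can be sketched as follows; the reference value (slack) and decision threshold are illustrative, not values taken from [5] or [6].

```python
def cusum(observations, target, slack, threshold):
    """One-sided cumulative-sum chart: signal when the cumulated positive
    deviations from `target` (less a slack per observation) exceed the
    decision threshold."""
    s, alarms = 0.0, []
    for t, x in enumerate(observations):
        s = max(0.0, s + (x - target - slack))
        if s > threshold:
            alarms.append(t)
            s = 0.0                      # restart after a signal
    return alarms

# In-control observations followed by a shifted mean:
data = [0.1, -0.2, 0.0, 0.2, -0.1] + [1.0, 1.2, 0.9, 1.1]
print(cusum(data, target=0.0, slack=0.5, threshold=1.0))    # [6]
```

By accumulating small deviations instead of judging each unit observation alone, the chart detects the shift after two out-of-control points, which is the efficiency advantage the text refers to.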

5. REFERENCES

[1] R. E. Bellman and S. E. Dreyfus, Applied Dynamic Programming, Princeton University Press, 1962, Chapters 6, 8, and 9.

[2] R. A. Howard, Dynamic Programming and Markov Processes, M.I.T. Technology Press and Wiley, New York, 1960.

[3] R. Fortet, "Calcul des Probabilités," C.N.R.S., 222-231 (1950).

[4] J. Torrens-Ibern, "Gestión de la Producción para una demanda aleatoria estacional," Cuadernos de Estadística Aplicada e Investigación Operativa, 4, Fasc. 3, 1965, Barcelona.

[5] J. Torrens-Ibern, "Les méthodes statistiques de contrôle dans les processus industriels continus," Revue de Statistique Appliquée (Paris), 13, No. 1, 65-77 (1965); "Les graphiques de contrôle par cumul d'observations à droites limites parallèles," to appear in the Revue Belge de Statistique et de Recherche Opérationnelle (Bruxelles), 7 (1966).

[6] J. Torrens-Ibern, "Le contrôle statistique des modèles de gestion et décision," Communication to the 35th Session of the International Statistical Institute, Belgrade, September 1965; "Gobierno Científico de la Empresa," Alta Dirección (Barcelona), 1, No. 4, 19-26 (1965).


SOME THEORETICAL AND PRACTICAL ASPECTS 
OF THE ADAPTIVE CONTROL PROCESS 

RESUME 

The field of adaptive control processes is wide; its limits are difficult to prescribe. In this article we shall try to determine rational economic behavior in the face of an ergodic system with an unknown probability distribution.

At the time a decision is to be made, information relating to this probability distribution must be gathered in order that we may act on judgments more closely related to reality.

A distinction is made between a discrete variable, which can be dealt with in dynamic programming by using a Bayesian probability estimation, and a continuous variable, such as the demand prediction for production scheduling, which can be handled by regression equations. The problem of specific



150 A. CHARNES, M. J. L. KIRBY, AND W. M. RAIKE

conditions under which the optimal decision can be made (unbiased estima- 
tion, maximum likelihood, cost or risk considerations, etc.) is taken into 
account. 

Finally, the problem of probability distribution control is stated a 
posteriori to improve the distribution if the known results make it necessary. 
Because of the particular character of this control, of necessity made in 
discrete samples, control charts of cumulative observations are more efficient 
than the classic charts of independent observations. 


ZERO-ZERO CHANCE-CONSTRAINED GAMES

Jeux Zéro-Zéro à Contraintes par le Hasard


A. Charnes,† M. J. L. Kirby,‡
and W. M. Raike†

United States of America


1. INTRODUCTION 

In many of the situations that can be modeled as two-person, zero-sum 
matrix games the entries of the payoff matrix may be known to the players 
only to the extent of knowing them as random variables with specified prob- 
ability distributions. Of course, many hypotheses can be made concerning 
the degree to which the players can or cannot influence these probability 
distributions, and many hypotheses can be made concerning the information 
that is available to each player at the start of the game or becomes available 
to each player as the game progresses. The nature of these hypotheses 
depends critically on the particular conflict situation being modeled. There- 
fore the development of theories that employ the normal form of a game 
(extended to include the probabilistic payoff matrix case), the extensive form 
of a game, or a form in between these two in which, say, one plays a sequence 
of games whose payoff matrices are stochastically related, are all of interest 
because of the different situations one may wish to model. 

This article is the first in a sequence of papers directed toward extending 
game theory to encompass randomness in the payoff matrix. Here, we present 
some initial results on the simplest of our models. Subsequently we consider 
more complicated models that include the effects of increasing a player’s in- 
formation during a time sequence of plays of the game and the effects of a 
player’s inability to assume with certainty the actual implementation of his 
decisions. All of these notions we fit into a framework we call "chance-constrained games" because the notions of chance-constrained programming† and its theory form an essential part of our development.

† Northwestern University.

‡ University of Chicago.

Our first model is concerned with finding zero-order decision rules for a zero-sum game. We call it a "zero-zero chance-constrained game." We begin with zero-order rules because of their relative simplicity and because they have been the bases for generalizations to multiperiod (or multistage) chance-constrained models‡ that use the device of conditional chance constraints.

We begin by extending the notion of a minimax or maximin strategy to include the case of random entries in the payoff matrix. We do this by generalizing, in a chance-constrained programming fashion, the dual linear programming problems that are equivalent to a two-person, zero-sum game when the payoff matrix contains only deterministic elements.§ Then, assuming that the random elements in the payoff matrix are independent random variables, we obtain the deterministic equivalent nonlinear programming problem for each player. These problems are gamelike in structure and lead us to the conclusion that, in general, an optimal strategy for each player results in his playing as if he were confronted with a different deterministic game from that of his opponent.

We investigate properties of these gamelike nonlinear programs and obtain various results concerning tightness of inequality constraints and attainment of the optimum of the objective function on the boundary of the set of feasible solutions. We show that, in general, for any feasible strategy of the minimizing player the "value" of the stochastic game will be greater than the "value" of the game to the maximizing player, even if the maximizing player uses his optimal strategy. Thus we obtain a theorem similar to known duality results in mathematical programming.

We then consider several special cases that lead to pure strategy solutions for each player. Here we obtain the surprising result that it is possible for the stochastic game to have a saddle point even though both players act, in general, as if they were facing different deterministic games.

Next, we formulate and discuss the extension of the concepts introduced to situations involving multiperiod games in which each player accumulates information about his opponent and the payoff matrix as time progresses. We close the article with some illustrative examples of stochastic games, including one we were unable to repress: a "007" game (zero-order, zero-sum, value seven).

2. FORMULATION OF THE MODEL 

A two-person, zero-sum matrix game consists of an m × n matrix A in which the ijth element a_ij gives the payoff from player II to player I when player I plays pure strategy i and player II plays pure strategy j. In solving such a game we seek an m vector p*ᵀ = (p*_1, ..., p*_m) and an n vector q*ᵀ = (q*_1, ..., q*_n) with Σ_{i=1}^m p*_i = 1, p*_i ≥ 0, and Σ_{j=1}^n q*_j = 1, q*_j ≥ 0, such that

Σ_{i=1}^m Σ_{j=1}^n p_i a_ij q*_j ≤ Σ_{i=1}^m Σ_{j=1}^n p*_i a_ij q*_j ≤ Σ_{i=1}^m Σ_{j=1}^n p*_i a_ij q_j

for any other vectors p, q with Σ_{i=1}^m p_i = 1, p_i ≥ 0, and Σ_{j=1}^n q_j = 1, q_j ≥ 0. We call p*, q* the optimal mixed strategies for players I and II, respectively. It is well known that the vectors p*, q* exist for any matrix A.

Another way of stating the problem is to say that player I, whom we will often refer to as the maximizing player, seeks the m vector p* that solves

max_p min_q pᵀAq    (1)

subject to pᵀe_m = 1, p ≥ 0, and qᵀe_n = 1, q ≥ 0, where e_k is the k × 1 vector all of whose elements are +1. Player II, the minimizing player, seeks the n vector q* that solves

min_q max_p pᵀAq    (2)

subject to pᵀe_m = 1, p ≥ 0, and qᵀe_n = 1, q ≥ 0. Each of these problems can be written as a linear programming problem.† Problem (1) is equivalent to max_{p,δ} δ subject to

min_q pᵀAq ≥ δ,

pᵀe_m = 1,  p ≥ 0,    (3.1)

qᵀe_n = 1,  q ≥ 0,

or max δ subject to

Σ_{i=1}^m p_i a_ij ≥ δ,  j = 1, ..., n,

Σ_{i=1}^m p_i = 1,    (3.2)

p_i ≥ 0,  i = 1, ..., m.

Problem (2) is equivalent to min_{q,ρ} ρ subject to

max_p pᵀAq ≤ ρ,

pᵀe_m = 1,  p ≥ 0,    (4.1)

qᵀe_n = 1,  q ≥ 0,

or min ρ subject to

Σ_{j=1}^n a_ij q_j ≤ ρ,  i = 1, ..., m,

Σ_{j=1}^n q_j = 1,    (4.2)

q_j ≥ 0,  j = 1, ..., n.

† See [6], Vol. II, Chapter 19, pp. 750-757.
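Problem (3.2) is an ordinary linear program and can be solved directly. A sketch using scipy.optimize.linprog follows; the helper name and the sample matrix are ours, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def maximin_strategy(A):
    """Solve (3.2): max delta s.t. sum_i p_i a_ij >= delta for each j,
    sum_i p_i = 1, p >= 0, as a linear program."""
    m, n = A.shape
    # Variables (p_1, ..., p_m, delta); linprog minimizes, so use -delta.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # Each column j gives: delta - sum_i p_i a_ij <= 0.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum_i p_i = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[:m], res.x[-1]

p, value = maximin_strategy(np.array([[2.0, 0.0], [1.0, 3.0]]))
print(p, value)
```

For this 2 × 2 matrix the equalizing strategy p = (1/2, 1/2) gives both columns the payoff 3/2, which is the value of the game.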





When some of the a_ij are random variables, the concepts of a maximin strategy p* and a minimax strategy q* must be modified. One possibility that suggests itself immediately when the a_ij are independent of the p, q is to replace the random entries in the payoff matrix with their expected values. The objective of the maximizing player is then

max_p min_q pᵀE(A)q.

This formulation, however, is not so flexible or so closely tied into play as we should like. We develop instead the theory for a much more extensive concept that subsumes this expected value objective as a special instance.

To be specific, we consider the situation in which two opposing players are to play a zero-sum game in normal form, so that each is required to select an optimal mixed strategy; but, although each player knows the alternatives available to himself and his opponent, the outcome (or payoff) a_ij, when player I plays pure strategy i and player II plays pure strategy j, may be random, since some elements of the payoff matrix A may be random. We assume here that all random variables in the payoff matrix are mutually independent and that each player knows the marginal distribution of every random element in A.

Moreover, the players must select their optimal mixed strategies before any observations are taken on the a_ij. Thus p* and q* are to be "deterministic" vectors and are not explicit functions of the a_ij; p* and q* will, of course, depend implicitly on the a_ij and their distributions because of the nature of the problem. In the terminology of chance-constrained programming, p* and q* are to be zero-order decision rules.†

Under these circumstances the question is what criteria the players might reasonably use in order to choose their optimal strategies. What is the meaning of the word optimal in this context? For player I the objective we are considering is that of selecting a mixed strategy p which maximizes the total payoff δ that he can attain with at least probability α, no matter what strategy player II chooses. The probability α is preassigned by player I; it is not known to player II. In formal terms player I wants to solve max_{p,δ} δ subject to

P(min_q pᵀAq ≥ δ) ≥ α,

pᵀe_m = 1,  p ≥ 0,    (5.1)

qᵀe_n = 1,  q ≥ 0,

or max δ subject to

P(Σ_{i=1}^m p_i a_ij ≥ δ) ≥ α,  j = 1, ..., n,

Σ_{i=1}^m p_i = 1,    (5.2)

p_i ≥ 0,  i = 1, ..., m,



t For a further discussion of zero-order rules see [3] and [7], 



154 


A. CHARNES, M. J. L. KIRBY, AND W. M. RAIKE 


where the P operator means that we compute the probability by using the joint distribution of all the random variables involved in A.

This objective is a direct extension of the usual maximin objective of deterministic games which we have already discussed. It can be seen from (5.2) that if all a_ij are deterministic† then (5.2) reduces to (3.2) for any α > 0, since if a and b are any two constants, P(a ≥ b) ≥ α > 0 if and only if a ≥ b.

The corresponding objective for player II is to select a mixed strategy q which minimizes the total amount ρ that he pays his opponent with probability β or more, no matter what strategy player I chooses. Player I is not informed of β. Mathematically, player II's problem is min_{q,ρ} ρ subject to

P(max_p pᵀAq ≤ ρ) ≥ β,

pᵀe_m = 1,  p ≥ 0,    (6.1)

qᵀe_n = 1,  q ≥ 0,

or min ρ subject to

P(Σ_{j=1}^n a_ij q_j ≤ ρ) ≥ β,  i = 1, ..., m,

Σ_{j=1}^n q_j = 1,    (6.2)

q_j ≥ 0,  j = 1, ..., n.

The reason for our choosing chance constraints in (5) and (6) rather than some other modification of (3) and (4) can be explained. We assume that player I does not seek a strategy that will be maximin all the time. This may happen for several reasons. For one, it may not be possible to find a maximin strategy for all possible values of the random variables (e.g., it is not possible if in every row and column some random variable is normally distributed, for then a_ij = ∞ is possible). Even when a maximin strategy is possible, it may be too "expensive" to operate such a strategy. Furthermore, in a great many competitive situations the two players do not compete directly with each other all the time but only when certain events deviate significantly from their normal pattern.

This being the situation, it would seem reasonable to seek a maximin strategy only a certain percentage of the time and not all the time. Finally, note that if his opponent engages in some nonoptimal habits‡ of play, player I may well be able to do better than maximin at least some positive fraction of the time.

As already stated, the probabilities α and β are assumed to be specified by the players in advance of the plays of the game, and each alone knows

† Note that this possibility has not been ruled out by our assumptions, for a constant can be viewed as a random variable that takes on one value with probability one and all other values with probability zero.

‡ See, for example, [6], Vol. II, Chapter 20, p. 768 ff.







his specifications. Because it does not significantly affect the mathematical treatment of the problem and, in fact, is of interest in certain models, we will permit the α, β in the constraints of (5.2) and (6.2), respectively, to be different for different j and different i. Thus the chance constraints are of the form

P(Σ_{i=1}^m p_i a_ij ≥ δ) ≥ α_j,  j = 1, ..., n,  in (5.2)

and

P(Σ_{j=1}^n a_ij q_j ≤ ρ) ≥ β_i,  i = 1, ..., m,  in (6.2).


3. THE DETERMINISTIC EQUIVALENT PROBLEM 

We now turn to the problem of finding deterministic nonlinear programming 
problems that are equivalent to (5.2) and (6.2). To begin with, we have the 
following 

Lemma 1. For any p_i, i = 1, ..., m, and δ such that Σ_{i=1}^m p_i = 1, p_i ≥ 0,

P(Σ_{i=1}^m p_i a_ij ≥ δ) ≥ α_j    (7)

if and only if

P( ∪_{γ ∈ G_j} {ω: a_ij(ω) ≥ γ_ij, all i} ) ≥ α_j,    (7.1)

where

G_j = {γ_ij, i = 1, ..., m: Σ_{i=1}^m p_i γ_ij ≥ δ}

and ω is a point in the sample space of the a_ij.

Proof. We show that S_1 = {ω: Σ_{i=1}^m p_i a_ij(ω) ≥ δ} = S_2 = ∪_{γ ∈ G_j} {ω: a_ij(ω) ≥ γ_ij, all i} as sets. Clearly S_2 is contained in S_1, for if a_ij(ω) ≥ γ_ij, all i, for some ω in S_2 and γ_ij in G_j, then Σ_{i=1}^m p_i a_ij(ω) ≥ Σ_{i=1}^m p_i γ_ij ≥ δ by definition of G_j; thus ω is in S_1. Now suppose ω is an element of S_1. The choice of γ_ij = a_ij(ω) then guarantees that these γ_ij are in G_j, so that ω is also an element of S_2. Hence S_1 is contained in S_2, completing the proof.

Because condition (7.1) is unwieldy, the following two lemmas are devoted to establishing simpler conditions, one of which is sufficient for (7) to hold and the other of which is necessary. We indicate the special circumstances in which one or the other of these reduces to (7.1).





Lemma 2 [sufficient condition]. For given δ and p_i, i = 1, ..., m, if there exist γ_ij, i = 1, ..., m, such that

Π_{i=1}^m P(a_ij ≥ γ_ij) ≥ α_j,

Σ_{i=1}^m p_i γ_ij ≥ δ,    (7.2)

Σ_{i=1}^m p_i = 1,  p_i ≥ 0,  i = 1, ..., m,

then δ and p_i, i = 1, ..., m, satisfy (7).

Proof. We are assuming that the a_ij are independent random variables. Hence

Π_{i=1}^m P(γ_ij ≤ a_ij) = P(∩_{i=1}^m [γ_ij ≤ a_ij]) ≤ P(∩_{i=1}^m [p_i γ_ij ≤ p_i a_ij]),

since P(p_i γ_ij ≤ p_i a_ij) is equivalent to P(γ_ij ≤ a_ij) when p_i > 0, but, when p_i = 0, P(γ_ij ≤ a_ij) ≤ P(0 ≤ 0) = 1. Hence, if Π_{i=1}^m P(γ_ij ≤ a_ij) ≥ α_j, then

P(∩_{i=1}^m [p_i γ_ij ≤ p_i a_ij]) ≥ α_j.

Therefore

P(Σ_{i=1}^m p_i γ_ij ≤ Σ_{i=1}^m p_i a_ij) ≥ α_j.

If we also have Σ_{i=1}^m p_i γ_ij ≥ δ, then we have P(δ ≤ Σ_{i=1}^m p_i a_ij) ≥ α_j.

Corollary. (7) is a necessary and sufficient condition that there exist γ_ij satisfying (7.2) if (a) p is a pure strategy (i.e., p_k = 1 for some k and all other p_i = 0) or (b) α_j = 1.

Proof.

(a) P(Σ_{i=1}^m p_i a_ij ≥ δ) = P(a_kj ≥ δ) ≥ α_j, so defining γ_kj = δ and γ_ij = −∞ for i ≠ k ensures that (7.2) holds.

(b) P(Σ_{i=1}^m p_i a_ij ≥ δ) = 1 implies Σ_{i=1}^m p_i a_ij(ω) ≥ δ almost everywhere. Thus the definition γ_ij = ess inf_ω a_ij(ω) > −∞ is valid; these γ_ij certainly satisfy (7.2).
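Lemma 2 can be checked empirically: pick γ_ij satisfying the product condition for one column j and compare the product bound with the simulated probability. All the numbers below (means, deviations, strategy, γ's) are invented for the check.

```python
import random
from statistics import NormalDist

random.seed(0)
# Independent normal entries a_1j, a_2j of one column j of the payoff matrix.
means, sds = [1.0, 2.0], [0.5, 0.5]
p = [0.5, 0.5]                                   # a fixed mixed strategy
gamma = [0.5, 1.5]                               # candidate gamma_ij
delta = sum(pi * g for pi, g in zip(p, gamma))   # = 1.0

# Left-hand side of (7.2): prod_i P(a_ij >= gamma_ij).
bound = 1.0
for mu, sd, g in zip(means, sds, gamma):
    bound *= 1.0 - NormalDist(mu, sd).cdf(g)

# Lemma 2 asserts P(sum_i p_i a_ij >= delta) is at least this bound.
trials = 20000
hits = sum(
    sum(pi * random.gauss(mu, sd) for pi, mu, sd in zip(p, means, sds)) >= delta
    for _ in range(trials)
)
print(hits / trials >= bound)    # True
```

The bound here is about 0.71 while the simulated probability is about 0.92, illustrating that (7.2) is sufficient but not tight.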

Lemma 3 [necessary condition]. If δ and p_i, i = 1, ..., m, with Σ_{i=1}^m p_i = 1, p_i ≥ 0, all i, satisfy (7), then there exist γ_ij, i = 1, ..., m, such that

1 − Π_{i=1}^m P(a_ij < γ_ij) ≥ α_j,

Σ_{i=1}^m p_i γ_ij ≥ δ.    (7.3)

† Here and in the sequel statements in regard to existence of quantities are understood to refer to the extended real numbers.





Proof. 1 − Π_{i=1}^m P(a_ij < γ_ij) = P(a_ij(ω) ≥ γ_ij for some i) = P(S_3).

Suppose ω is an element of S_1 = {ω: Σ_{i=1}^m p_i a_ij(ω) ≥ δ}. Then certainly a_kj(ω) ≥ δ for some k, since p_i ≥ 0, all i, and Σ_{i=1}^m p_i = 1, so that ω ∈ S_3. The choice γ_ij = a_ij(ω) then ensures satisfaction of (7.3).

The conditions for player II's problem that are analogous to (7) through (7.3) are

P(Σ_{j=1}^n a_ij q_j ≤ ρ) ≥ β_i,  i = 1, ..., m,  Σ_{j=1}^n q_j = 1,  q_j ≥ 0 for all j,     (8)

P({ω: (a_i1(ω), ..., a_in(ω)) ∈ H_i}) ≥ β_i,  i = 1, ..., m,     (8.1)

where

H_i = {(φ_ij, j = 1, ..., n): Σ_{j=1}^n φ_ij q_j ≤ ρ},

Π_{j=1}^n P(a_ij ≤ φ_ij) ≥ β_i,  i = 1, ..., m,

Σ_{j=1}^n φ_ij q_j ≤ ρ,  i = 1, ..., m,     (8.2)

Σ_{j=1}^n q_j = 1,  q_j ≥ 0,  j = 1, ..., n,

and

1 − Π_{j=1}^n P(a_ij > φ_ij) ≥ β_i,  i = 1, ..., m,

Σ_{j=1}^n φ_ij q_j ≤ ρ,  i = 1, ..., m.     (8.3)


Using these conditions, we obtain the analogues of Lemmas 1 through 3 for player II.

We propose now to investigate properties of solutions to the problems whose constraints are (7.2) and (8.2).

Put P(y_ij ≤ a_ij) = 1 − F_ij(y_ij), where F_ij(·) is the distribution function of a_ij; F_ij(·) is assumed to be left continuous. Similarly, P(a_ij ≤ φ_ij) = F_ij(φ_ij). Hence the problems we investigate are max δ, subject to

Σ_{i=1}^m p_i = 1,  p_i ≥ 0,  i = 1, ..., m,

Σ_{i=1}^m p_i y_ij ≥ δ,  j = 1, ..., n,     (9)

Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j,  j = 1, ..., n,

• *'i 


and min ρ, subject to

Σ_{j=1}^n q_j = 1,  q_j ≥ 0,  j = 1, ..., n,

Σ_{j=1}^n φ_ij q_j ≤ ρ,  i = 1, ..., m,     (10)

Π_{j=1}^n F_ij(φ_ij) ≥ β_i,  i = 1, ..., m.





Observe that for any specified values of y_ij, i = 1, ..., m, j = 1, ..., n, such that Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j, j = 1, ..., n, (9) is similar to (3.2). Hence (9) can be viewed as a problem of finding a maximin strategy for a deterministic game whose ijth payoff matrix element is y_ij. Thus player I's problem (9) can be stated as a problem of finding an optimal matrix game Γ = (y_ij) from the set of all Γ such that

Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j,  j = 1, ..., n,

and finding an optimal mixed strategy for the optimal game. Let

T_1 = {Γ = (y_ij): Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j, j = 1, ..., n}.

Then T_1 defines the set of feasible games for player I. We will let Γ̄ = (ȳ_ij) denote an optimal game, p̄ = (p̄_1, ..., p̄_i, ..., p̄_m) be an optimal maximin strategy for Γ̄, and δ(Γ) be the value of game Γ. We shorten the latter designation to δ̄ when no confusion results.

Similarly, we let

T_2 = {Φ = (φ_ij): Π_{j=1}^n F_ij(φ_ij) ≥ β_i, i = 1, ..., m},

so that T_2 is the set of feasible games for player II. Φ̄, q̄, ρ̄ are defined as the optimal values of Φ, q = (q_1, ..., q_j, ..., q_n), ρ for (10).

We now investigate various properties and special cases of (9) and (10). In particular, we are interested in seeing when δ̄ = ρ̄, so that our original stochastic game has a value. We also give some sufficient and some necessary conditions for p̄ and q̄ to be pure strategy solutions.
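For a fixed game Γ, verifying membership in T_1 and verifying the maximin constraints of (9) are both direct computations. A small sketch with uniformly distributed payoffs (the ranges, levels α_j, and the candidate (Γ, p, δ) are illustrative assumptions, not from the text):

```python
def uniform_cdf(x, a, b):
    """Distribution function of a uniform random variable on (a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Illustrative 2x2 data: (lower, upper) endpoints of each uniform a_ij.
ranges = [[(0, 4), (2, 6)],
          [(1, 5), (0, 8)]]
alpha = [0.5, 0.5]
gamma = [[1.0, 3.0],          # candidate game Γ = (y_ij)
         [2.0, 2.0]]
p, delta = [0.5, 0.5], 1.5    # candidate maximin strategy and level

m, n = len(gamma), len(gamma[0])

# Membership in T_1:  Π_i [1 − F_ij(y_ij)] ≥ α_j for every column j.
for j in range(n):
    prod = 1.0
    for i in range(m):
        a, b = ranges[i][j]
        prod *= 1.0 - uniform_cdf(gamma[i][j], a, b)
    assert prod >= alpha[j], "Γ is not feasible for player I"

# Maximin constraints of (9):  Σ_i p_i y_ij ≥ δ for every column j.
assert abs(sum(p) - 1.0) < 1e-12 and min(p) >= 0.0
for j in range(n):
    assert sum(p[i] * gamma[i][j] for i in range(m)) >= delta

print("candidate (Γ, p, δ) satisfies the constraints of (9)")
```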

We will have occasion to refer to the problems max δ, subject to

1 − Π_{i=1}^m F_ij(y_ij) ≥ α_j,  j = 1, ..., n,     (9′)

Σ_{i=1}^m p_i y_ij ≥ δ,  j = 1, ..., n,

Σ_{i=1}^m p_i = 1,  p_i ≥ 0,  all i,

and min ρ, subject to

1 − Π_{j=1}^n [1 − F_ij(φ_ij)] ≥ β_i,  i = 1, ..., m,     (10′)

Σ_{j=1}^n φ_ij q_j ≤ ρ,  i = 1, ..., m,

Σ_{j=1}^n q_j = 1,  q_j ≥ 0,  all j.





Note that the constraints of (9′) and (10′) are given by (7.3) and (8.3), respectively. The relationships between problems (5.2), (9), and (9′) are given by Lemmas 2 and 3. Hence any p and δ satisfying (9) (for some set of y_ij) also satisfy (5.2), and p and δ satisfying (5.2) must satisfy (9′) for some set of y_ij; analogous remarks hold for (6.2), (10), and (10′).


4. PROPERTIES OF Γ̄ AND Φ̄


Theorem 1. The constraints Σ_{i=1}^m p_i y_ij ≥ δ, j = 1, ..., n, in (9) and (9′) can be replaced with Σ_{i=1}^m p_i y_ij = δ, j = 1, ..., n.

Proof. Let Σ_{i=1}^m p_i y_ij = h_j ≥ δ, j = 1, ..., n. Let ŷ_ij = y_ij − h_j + δ, where, of course, δ − h_j ≤ 0. Then

Σ_{i=1}^m p_i ŷ_ij = Σ_{i=1}^m p_i y_ij − Σ_{i=1}^m p_i (h_j − δ)

= Σ_{i=1}^m p_i y_ij − h_j Σ_{i=1}^m p_i + δ Σ_{i=1}^m p_i

= h_j − h_j + δ

= δ.

Also

Π_{i=1}^m [1 − F_ij(ŷ_ij)] = Π_{i=1}^m [1 − F_ij(y_ij − h_j + δ)] ≥ Π_{i=1}^m [1 − F_ij(y_ij)]

and

1 − Π_{i=1}^m F_ij(ŷ_ij) ≥ 1 − Π_{i=1}^m F_ij(y_ij),

because F_ij(·) is a monotonic nondecreasing function and (−h_j + δ) ≤ 0. Hence

Π_{i=1}^m [1 − F_ij(ŷ_ij)] ≥ α_j  and  1 − Π_{i=1}^m F_ij(ŷ_ij) ≥ α_j.

Thus the game Γ̂ = (ŷ_ij) is feasible for (9) and (9′) and gives Σ_{i=1}^m p_i ŷ_ij = δ. Therefore the theorem is proved.

Corollary. The constraints

Σ_{j=1}^n φ_ij q_j ≤ ρ,  i = 1, ..., m,

in (10) and (10′) can be replaced by equality constraints.

Theorem 2. δ(Γ) is a nondecreasing function of Γ.

Proof. If we consider two games with payoff matrices (b_ij) and (c_ij) such that

b_ij ≤ c_ij for all i and j,

the value of the game with payoff matrix (c_ij) is greater than or equal to the value of the game with payoff matrix (b_ij).
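For 2 × 2 games the value δ(Γ) has a well-known closed form, which makes the monotonicity asserted in Theorem 2 easy to exercise numerically. A sketch (the branch test and the sample matrices are standard textbook material, not from the text):

```python
def value_2x2(g):
    """Value of the 2x2 zero-sum game g = [[a, b], [c, d]]; the row player maximizes."""
    (a, b), (c, d) = g
    maximin = max(min(a, b), min(c, d))   # best pure-strategy row guarantee
    minimax = min(max(a, c), max(b, d))   # best pure-strategy column cap
    if maximin == minimax:                # saddle point: pure strategies are optimal
        return maximin
    # mixed-strategy value when there is no saddle point
    return (a * d - b * c) / (a + d - b - c)

b_game = [[1.0, 4.0], [3.0, 2.0]]
c_game = [[2.0, 5.0], [3.0, 2.0]]         # c_ij >= b_ij elementwise
assert all(c_game[i][j] >= b_game[i][j] for i in range(2) for j in range(2))
# Theorem 2: raising payoff entries cannot lower the value of the game.
assert value_2x2(c_game) >= value_2x2(b_game)
print(value_2x2(b_game), value_2x2(c_game))   # 2.5 2.75
```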





Corollary. ρ(Φ) is a nondecreasing function of Φ.

Theorem 3. If for each j at least one a_ij is a continuous random variable, then there exists a Γ̄ on the boundary of T_1; that is, if for each j at least one a_ij is a continuous random variable, then all the inequality constraints of (9), except for the non-negativities, may be replaced by equality constraints.

Proof. By Theorem 1 the constraints Σ_{i=1}^m p_i y_ij ≥ δ, j = 1, ..., n, may be replaced by equality constraints. Now suppose a_kj is a continuous random variable. Then F_kj(y_kj) is a continuous, nondecreasing function of y_kj. Hence, if Π_{i=1}^m [1 − F_ij(y_ij)] > α_j, we can increase y_kj to y*_kj, where

Π_{i≠k} [1 − F_ij(y_ij)] · [1 − F_kj(y*_kj)] = α_j.

But by Theorem 2 increasing y_kj to y*_kj does not decrease δ, and we have a new optimal solution in which the constraint Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j is satisfied as an equation. Thus the theorem is proved.

Note that this result does not necessarily hold if for some column j of A all a_ij are discrete. In this case F_ij(y_ij) is not continuous for any i = 1, ..., m, and so increasing y_kj to y_kj + ε may result in the constraint on α_j no longer being satisfied.

Corollary. If, for each i, at least one a_ij is a continuous random variable, then there exists a Φ̄ on the boundary of T_2.

Theorem 3 and its corollary are used in our discussion of pure strategy solutions. In addition, it appears that they are particularly useful when (9) or (10) is being solved as a nonlinear programming problem.

Note that the analogues of δ(Γ) and ρ(Φ) can be defined for (9′) and (10′); by use of these definitions Theorems 2 and 3 and their corollaries can be extended to include these problems also.

5. RELATIONS BETWEEN Γ AND Φ

Let T = T_1 ∩ T_2, so that T is the set of games feasible for both players.

Theorem 4. Suppose that either Π_{j=1}^n (1 − α_j) < β_k for some k or Π_{i=1}^m (1 − β_i) < α_r for some r. Then T is empty.

Proof. Suppose Γ ∈ T_1. Then Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j, j = 1, ..., n. Therefore for each k, k = 1, ..., m, 1 − F_kj(y_kj) ≥ α_j, j = 1, ..., n, since all of the factors are in the interval [0, 1]. Thus

F_kj(y_kj) ≤ 1 − α_j for each k and j.

Then

Π_{j=1}^n F_kj(y_kj) ≤ Π_{j=1}^n (1 − α_j) for each k = 1, ..., m.

Hence, if Π_{j=1}^n (1 − α_j) < β_k for some k, we have Π_{j=1}^n F_kj(y_kj) < β_k; hence Γ ∉ T_2. In a similar manner we can show that if Φ ∈ T_2 then Φ ∉ T_1 if

Π_{i=1}^m (1 − β_i) < α_r for some r.

Corollary. If α_j > ½ for some j and β_i > ½ for some i, then T is empty.

Proof. Suppose β_k > ½. Then

Π_{j=1}^n (1 − α_j) ≤ min_j (1 − α_j) < ½ < β_k.

This corollary is particularly important, for it shows that in most games of interest T is empty, since the β_i, i = 1, ..., m, and α_j, j = 1, ..., n, are all near one. Hence optimal action by both players will correspond to their playing different deterministic zero-sum games, Γ̄ for player I and Φ̄ for player II, even though they are playing the same chance-constrained zero-sum game.

Theorem 5. Suppose that α_j > ½, j = 1, ..., n, and β_i > ½, i = 1, ..., m. Then y_ij < φ_ij, i = 1, ..., m, j = 1, ..., n, for any pair of games Γ ∈ T_1 and Φ ∈ T_2.

Proof. Γ ∈ T_1 and α_j > ½ imply that

1 − F_ij(y_ij) > ½,

or

F_ij(y_ij) < ½,  i = 1, ..., m.

Similarly, Φ ∈ T_2 and β_i > ½ imply that F_ij(φ_ij) > ½, j = 1, ..., n. Hence

F_ij(y_ij) < F_ij(φ_ij),     (11)

and the fact that F_ij(·) is monotonic nondecreasing gives y_ij < φ_ij, all i, j.

It is important to note that this theorem cannot, in general, be extended to include the case of α_s = ½, β_r = ½ for some r and s, since then we would get

F_rs(y_rs) ≤ F_rs(φ_rs)     (12)

instead of (11) when i = r, j = s; but (12) can be satisfied by y_rs, φ_rs with φ_rs < y_rs if y_rs, φ_rs lie in an interval over which F_rs(·) is constant. This, however, suggests an extension of Theorem 5.

Theorem 6. Suppose that ½ ≤ α_j < 1, j = 1, ..., n, ½ ≤ β_i < 1, i = 1, ..., m, and F_ij(·), i = 1, ..., m, j = 1, ..., n, is a strictly monotonic increasing function whenever 0 < F_ij(x) < 1. Then y_ij ≤ φ_ij, i = 1, ..., m, j = 1, ..., n.

Theorem 7. Suppose that α_j > ½, j = 1, ..., n, and β_i > ½, i = 1, ..., m. Then δ(Γ) ≤ ρ(Φ) for any Γ ∈ T_1, Φ ∈ T_2.





Proof. Consider, for fixed Φ ∈ T_2, the dual to the linear programming problem max δ, subject to

Σ_{i=1}^m p_i = 1,  p_i ≥ 0,  i = 1, ..., m,

Σ_{i=1}^m p_i y_ij ≥ δ,  j = 1, ..., n.

Let z(Γ) denote the optimal value of the dual problem. Then by the dual theorem of linear programming z(Γ) = δ(Γ); but δ(Γ) ≤ δ(Γ̄), and z(Γ̄) ≤ ρ(Φ) for any Φ ∈ T_2, since by Theorem 5 ȳ_ij < φ_ij. Hence δ(Γ) ≤ δ(Γ̄) = z(Γ̄) ≤ ρ(Φ) ≤ ρ(Φ̄), and the theorem is proved.

Corollary. Suppose that ½ ≤ α_j < 1, j = 1, ..., n, ½ ≤ β_i < 1, i = 1, ..., m, and F_ij(·), i = 1, ..., m, j = 1, ..., n, is a strictly monotonic increasing function whenever 0 < F_ij(x) < 1. Then δ(Γ) ≤ ρ(Φ) for any Γ ∈ T_1 and Φ ∈ T_2.

Theorem 7 and its corollary are reminiscent of the relationship between dual problems in linear programming, for which the value of the objective function of the minimizing problem for any feasible solution is greater than or equal to the value of the objective function of the maximizing problem for any feasible solution to the maximizing problem.


Theorem 8.

(a) ρ̄ − δ̄ ≤ max_{i,j} (φ_ij) − min_{i,j} (y_ij),

(b) ρ̄ − δ̄ ≤ max_i [(1/n) Σ_{j=1}^n φ_ij] − min_j [(1/m) Σ_{i=1}^m y_ij],

for any Γ ∈ T_1 and Φ ∈ T_2.

Proof.

(a) Clearly δ̄ ≥ min_{i,j} (y_ij), since this minimum is the worst possible payoff from player I's point of view. Similarly,

ρ̄ ≤ max_{i,j} (φ_ij).

(b) Since p_i = 1/m, i = 1, ..., m, is a feasible strategy for (9) for any Γ, we have

δ̄ ≥ min_j (1/m) Σ_{i=1}^m y_ij.

In an analogous manner, with q_j = 1/n, j = 1, ..., n, we obtain

ρ̄ ≤ max_i (1/n) Σ_{j=1}^n φ_ij.

This theorem provides estimates of the gap between the value of the optimal game to player I and the value of the optimal game to player II.





6. THE SPECIAL CASES OF α_j = 0 AND α_j = 1, j = 1, ..., n

Let us now consider the case in which α_j = 1, j = 1, ..., n, β_i = 1, i = 1, ..., m, and all a_ij are random variables with a finite range. Then

Π_{i=1}^m [1 − F_ij(y_ij)] ≥ α_j = 1

implies

1 − F_ij(y_ij) = 1,  or  F_ij(y_ij) = 0,

so that the only feasible Γ are those for which F_ij(y_ij) = 0, i = 1, ..., m, j = 1, ..., n; but, by Theorem 2, δ(Γ) is monotonic nondecreasing; hence we have that

ȳ_ij = max{y_ij: F_ij(y_ij) = 0}.     (13)

Note that because F_ij(·) is left continuous we can use max rather than sup. Hence in this case (9) becomes max δ, subject to

Σ_{i=1}^m p_i = 1,  p_i ≥ 0,  i = 1, ..., m,     (14)

Σ_{i=1}^m p_i ȳ_ij ≥ δ,  j = 1, ..., n,

where the ȳ_ij are defined by (13).

Similarly, if we define φ̄_ij = min{φ_ij: F_ij(φ_ij) = 1}, the problem faced by player II is min ρ, subject to

Σ_{j=1}^n φ̄_ij q_j ≤ ρ,  i = 1, ..., m,     (15)

Σ_{j=1}^n q_j = 1,  q_j ≥ 0,  j = 1, ..., n.

Here we see that if a_ij is a constant for all i and j, then ȳ_ij = φ̄_ij and (14) and (15) are dual linear programming problems similar to (3.2) and (4.2).

Note also that if all α_j = 0 then (9) is unbounded, as is (10) if all β_i = 0. However, as long as at least one α_j > 0, (9) has a finite optimal value, since for this column j all entries y_ij, i = 1, ..., m, will be finite and the value of the game will be finite. Similarly, (10) has a finite optimal value as long as at least one β_i > 0.

Theorem 9. δ̄ is bounded from above if and only if α_j > 0 for some j, and ρ̄ is bounded from below if and only if β_i > 0 for some i.

Since (9) and (10) are always consistent, Theorem 9 gives a necessary and sufficient condition for the chance-constrained game to be meaningful.





7. PURE STRATEGY SOLUTIONS 

In this section we investigate what conditions are sufficient and what conditions are necessary for p̄ and q̄ to be pure strategy solutions. We assume that all α_j, β_i are strictly positive. Let

y*_ij = max{y_ij: F_ij(y_ij) ≤ 1 − α_j}

and let

φ*_ij = min{φ_ij: F_ij(φ_ij) ≥ β_i}.

Let Γ* = (y*_ij) and Φ* = (φ*_ij). Then Γ* and Φ* are not in general feasible games (i.e., Γ* ∉ T_1 and Φ* ∉ T_2). In fact, if F_ij(·) is a strictly monotonic function for all i and j, Γ* and Φ* will be feasible if and only if α_j = 1 for all j and β_i = 1 for all i. This follows from the fact that when F_ij(·) is strictly monotonic y*_ij and φ*_ij satisfy F_ij(y*_ij) = 1 − α_j and F_ij(φ*_ij) = β_i, and so

Π_{i=1}^m [1 − F_ij(y*_ij)] = (α_j)^m ≥ α_j

if and only if α_j = 1. Similarly,

Π_{j=1}^n F_ij(φ*_ij) = (β_i)^n ≥ β_i

if and only if β_i = 1.

However, any game Γ obtained by choosing y_ij = y*_ij for one element in each column and y_ij = −∞ for all other elements is a feasible game, since then 1 − F_ij(y_ij) = 1 except for the one element in each column for which 1 − F_ij(y_ij) = 1 − F_ij(y*_ij) ≥ α_j. We are now going to consider a particular feasible game chosen from Γ* as already outlined. We let Γ̂ denote the game chosen as follows.

Let row k be the row for which

y*_kr = max_i min_j y*_ij.

Let

ŷ_ij = y*_ij,  j = 1, ..., n,  i = k,

ŷ_ij = −∞,  otherwise.

Then, as we have indicated, Γ̂ ∈ T_1. Let V(Γ*) be the value of the zero-sum game with payoff matrix Γ*.


Theorem 10. If V(Γ*) = δ(Γ̂), then

p̂_i = 1,  i = k,

p̂_i = 0,  i ≠ k,

and Γ̂ are optimal for (9).

Proof. y*_ij has been chosen so that it is at least as large as y_ij for any feasible game Γ; that is, if y_ij > y*_ij for any i and j, then Γ ∉ T_1. Hence V(Γ*) provides an upper bound on δ(Γ̄); but Γ̂ is feasible for (9), and the p̂_i, i = 1, ..., m, defined above give

δ(Γ̂) = V(Γ*),

since Γ̂ has a saddle point, so that

δ(Γ̂) = y*_kr.

Hence the theorem is proved.

Corollary. If Γ* has a saddle point, then p̂, Γ̂, δ(Γ̂) are optimal for (9), where p̂, Γ̂ are as defined in Theorem 10.

Proof. If Γ* has a saddle point, then it can be found exactly as we found y*_kr. Hence V(Γ*) = y*_kr = δ(Γ̂).

Theorem 10 and its corollary provide us with two ways of testing (9) for a pure strategy solution. It is easy to determine whether Γ* has a saddle point, and if it has not we can then compute V(Γ*) by using one of the well-known methods of finding the value of a deterministic zero-sum game. Since δ(Γ̂) = y*_kr, we then check to see whether V(Γ*) = y*_kr; if it does, an optimal pure strategy is at hand. (If V(Γ*) ≠ y*_kr, we still cannot exclude the possibility that there is some pure strategy that is optimal.)
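The test described above begins with a comparison of the pure maximin and minimax values of Γ*. A minimal sketch, applied to the matrix of Example 1 in Section 9 (there Γ* is Table 1, with a saddle point in row 2, column 2 and value 7):

```python
def saddle_point(game):
    """Return (k, r, value) if the matrix game has a pure saddle point, else None.

    game is a list of rows; the row player maximizes, the column player minimizes.
    """
    maximin = max(min(row) for row in game)                 # best guaranteed row payoff
    minimax = min(max(row[j] for row in game) for j in range(len(game[0])))
    if maximin != minimax:
        return None
    for k, row in enumerate(game):                          # locate a saddle entry
        for r, v in enumerate(row):
            if v == min(row) and v == max(g[r] for g in game):
                return k, r, v
    return None

gamma_star = [[8, 4, 9, 3],
              [9, 7, 8, 7],
              [3, 6, 5, 9]]
print(saddle_point(gamma_star))   # (1, 1, 7): row 2, column 2, value 7
```

When `saddle_point` returns None, V(Γ*) must instead be computed by a mixed-strategy method, and compared with y*_kr as in the corollary.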

Theorem 11. If p̄ is a pure strategy, then Γ̂ is an optimal game.

Proof. Suppose

p̄_i = 1,  i = k,

p̄_i = 0,  i ≠ k,

for the optimal game Γ̄. Then δ(Γ̄) = min_j (ȳ_kj), for if p̄ is a pure strategy Γ̄ has a saddle point; but y*_kj ≥ ȳ_kj by our definition of y*_ij. Hence Γ̂ is a feasible game for which ŷ_kj ≥ ȳ_kj, j = 1, ..., n, and so min_j (ŷ_kj) ≥ min_j (ȳ_kj). Hence δ(Γ̂) ≥ δ(Γ̄), and so Γ̂ is an optimal game.

Theorem 11 gives the useful result that if we know there exists an optimal strategy that is a pure strategy (or if for some reason player I limits himself to pure strategy solutions†) we can find the optimal game and strategy. We do this by first computing y*_ij for all i and j and then finding Γ̂.

Theorem 12. δ(Γ̂) ≤ δ(Γ̄) ≤ V(Γ*).

Proof. This follows directly from Γ̂ being feasible and our definition of Γ*. It is clear that there are identical theorems for the game Φ̂, which has

φ̂_ij = φ*_ij,  i = 1, ..., m,  j = r,

φ̂_ij = +∞,  otherwise,

† In Section 9 we indicate briefly a situation in which this might occur. This idea is used in our subsequent development of certain multiperiod models.





where column r is the column for which φ*_kr = min_j max_i φ*_ij. Note also that if α_j = ½ = β_i for all i and j and for each i and j there exists an x_ij such that F_ij(x_ij) = ½ and F_ij(x_ij − ε) < ½ < F_ij(x_ij + ε) for any ε > 0, then y*_ij = φ*_ij.

Theorem 13. If α_j = ½ = β_i for all i and j, if F_ij(·) has the properties described above, and if V(Γ*) = δ(Γ̂) = ρ(Φ̂), then both players will play pure strategies and

δ(Γ̂) = ρ(Φ̂) = y*_kr = φ*_kr.

In other words, the value of the chance-constrained game will be the same to both players.

Proof. The proof follows directly from Theorem 10 and the corresponding theorem for the q and Φ games.

Corollary. If α_j = ½ = β_i for all i and j and F_ij(·) has the properties described above and if V(Γ*) = δ(Γ̂) = ρ(Φ̂), then the distribution of payoffs to the chance-constrained game is given by the distribution of a_kr.

Thus we have established a sufficient condition for the chance-constrained game to have a saddle point.


8. EXTENSIONS OF THE MODEL 

One way of extending the ideas discussed above is to consider a situation in which, initially, the players know only the ranges for the parameters of the distributions of the a_ij with certain degrees of confidence, rather than the exact values of these parameters. Then, with each play of the game, the resulting payoff provides the players with further information about these unknown parameter values. In this case the problem for player I becomes one of finding a set of strategies p^(k), k = 1, 2, 3, ..., where p^(k) should be used on the kth play of the game. Clearly p^(k) should take into account all the information that player I has accumulated over the first k − 1 plays of the game. Thus p^(k) is no longer a zero-order decision rule but fits into the class of n-period decision rules discussed in [3], [4], [8].

It is also clear that under these circumstances we must examine critically the basic premises of deterministic zero-sum game theory. In such a situation is it a useful as well as meaningful concept for player I to choose p^(k) to be a mixed strategy when he will use it for only one play of the game? Perhaps player I should be limited to selecting a pure strategy as p^(k). If we do require p^(k) to be a pure strategy, the results of Section 7 can be of considerable help in finding the optimal set of strategies p^(k), k = 1, 2, 3, ....

A further point in discussing such a time-ordered sequence of n plays of a game concerns the question of when this sequence can be decomposed into a set of n time-independent games, each of which is to be played exactly once. The answer here depends on several factors. Does the choice of a certain strategy on play k limit the strategies that are available on play k + 1? In a great many military conflict situations this will happen. Is it ever best to attempt to deceive the opponent on one play of the game by not playing optimally in order to trap him into a large loss at a later play? This, too, would depend on the particular situation being modeled. Thus we see that there is a variety of models, all of which fit into the general theory of multiperiod chance-constrained programming models, which bear further investigation.

Similar models develop if we assume that all a_ij are deterministic but that a player cannot guarantee with certainty that his strategies will be implemented. In other words, a player may find himself in an environment over which he does not have complete control. Thus he may decide on a certain strategy which, because of nature or his opponent, is rarely carried out. Under suitable assumptions such a situation can be converted into a model of the type discussed above.

9. SOME EXAMPLES 

In this section we give three brief examples that illustrate some of the results in Sections 6 and 7.

Example 1. Let m = 3, n = 4, and suppose that

α_j = ½ = β_i for all i and j.

Let a_ij be N(μ_ij, 1), i = 1, ..., 3, j = 1, ..., 4, where the μ_ij are shown in the following table:

Table 1

8  4  9  3
9  7  8  7
3  6  5  9

Then, since F_ij(μ_ij) = ½ for a normal random variable, Table 1 also contains y*_ij, since y*_ij = μ_ij = φ*_ij; but if we regard Table 1 as a matrix game, we see that it has a saddle point in row 2, column 2. Hence by the corollary to Theorem 10

Γ̂ = ( −∞  −∞  −∞  −∞
       9    7    8    7
      −∞  −∞  −∞  −∞ )

and p̄^T = (0, 1, 0) and q̄^T = (0, 1, 0, 0). Moreover, by Theorem 13, δ(Γ̂) = ρ(Φ̂) = 7, and we have an example of a 007 game.

Note also that, by the corollary to Theorem 13, the distribution of payoffs to the chance-constrained game is an N(7, 1) random variable.






Example 2. Suppose again that a_ij is N(μ_ij, 1) with μ_ij given by Table 1. Let

α_j = .95,  j = 1, 2, 3, 4,

and

β_i = .99,  i = 1, 2, 3.

Then

y*_ij = μ_ij − 1.645

and

φ*_ij = μ_ij + 2.327,

and so

δ(Γ̂) = 7 − 1.645 = 5.355

and

ρ(Φ̂) = 7 + 2.327 = 9.327.

Also

p̄^T = (0, 1, 0) and q̄^T = (0, 1, 0, 0).

Thus we have an example of a situation in which both players choose their optimal strategies by facing different deterministic games, while the chance-constrained game has a saddle value.
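In Examples 1 and 2, y*_ij and φ*_ij are normal quantiles: y*_ij = μ_ij + z_{1−α_j} and φ*_ij = μ_ij + z_{β_i}, where z_t is the t-quantile of N(0, 1). A sketch recovering the constants used above (the bisection inverse of the normal distribution function is an implementation convenience, not part of the text):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_ppf(t, lo=-10.0, hi=10.0):
    """t-quantile of the standard normal, by bisection on the CDF."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu = [[8, 4, 9, 3],           # Table 1: means of the N(μ_ij, 1) payoffs
      [9, 7, 8, 7],
      [3, 6, 5, 9]]

# Example 1: α_j = β_i = 1/2, so y*_ij = φ*_ij = μ_ij, and the saddle of
# Table 1 (row 2, column 2) gives δ = ρ = 7.
maximin = max(min(row) for row in mu)
minimax = min(max(row[j] for row in mu) for j in range(4))
assert maximin == minimax == 7

# Example 2: α_j = .95 shifts every y*_ij down by 1.645, and β_i = .99
# shifts every φ*_ij up by about 2.326; the saddle cell shifts with them.
z05, z99 = norm_ppf(0.05), norm_ppf(0.99)
delta = 7 + z05     # δ(Γ̂) = 7 − 1.645 = 5.355
rho = 7 + z99       # ρ(Φ̂) ≈ 9.326 (the text rounds z_.99 up to 2.327)
print(round(delta, 3), round(rho, 3))
```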

Example 3. In this example we let α_j = 1 = β_i for all i and j. Let a_ij be uniformly distributed over (y_ij, z_ij), where y_ij, z_ij are shown in Table 2:

Table 2

[4, 12]  [2, 6]   [2, 16]  [2, 4]
[2, 16]  [2, 12]  [6, 10]  [3, 11]
[2, 4]   [5, 7]   [2, 8]   [8, 10]

Then, because ȳ_ij = max{y_ij: F_ij(y_ij) = 0}, we find that ȳ_ij is as shown in Table 3:

Table 3

4  2  2  2
2  2  6  3
2  5  2  8




Solving Γ̄ for p̄, we find that p̄^T = (6/13, 3/13, 4/13) and δ̄ = 38/13. Similarly, since φ̄_ij = min{φ_ij: F_ij(φ_ij) = 1}, we find that φ̄_ij is as shown in Table 4:

Table 4

12  6   16  4
16  12  10  11
4   7   8   10

and q̄^T = (0, 0, 7/13, 6/13) and ρ̄ = 136/13.

Thus by playing p̄ player I can assure himself a payoff of at least 38/13 with probability 1, whereas player II can guarantee that his loss will be limited to at most 136/13 with probability 1. Note that the large gap between δ̄ and ρ̄ is due to the extreme conservatism expressed by all α_j, β_i being equal to 1. Also note the gap estimate of Theorem 8a in this connection.
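The solution of Example 3 can be certified without a linear-programming code: for each side, a pair of mixed strategies achieving the same guarantee proves optimality by the minimax theorem. A sketch in exact rational arithmetic (the certificate strategies below are derived from Tables 3 and 4; treat them as a reconstruction of the reported solution):

```python
from fractions import Fraction as F

gamma = [[4, 2, 2, 2],        # Table 3: lower endpoints of the uniform ranges
         [2, 2, 6, 3],
         [2, 5, 2, 8]]
phi = [[12, 6, 16, 4],        # Table 4: upper endpoints
       [16, 12, 10, 11],
       [4, 7, 8, 10]]

# Player I's game: p = (6/13, 3/13, 4/13) guarantees 38/13 against Table 3,
# and q = (6/13, 4/13, 3/13, 0) caps the row player at 38/13, so δ = 38/13.
p = [F(6, 13), F(3, 13), F(4, 13)]
q = [F(6, 13), F(4, 13), F(3, 13), F(0)]
col_payoffs = [sum(p[i] * gamma[i][j] for i in range(3)) for j in range(4)]
row_payoffs = [sum(gamma[i][j] * q[j] for j in range(4)) for i in range(3)]
delta = F(38, 13)
assert min(col_payoffs) == delta and max(row_payoffs) == delta

# Player II's game: q = (0, 0, 7/13, 6/13) holds the loss to 136/13 against
# Table 4, and p = (1/13, 12/13, 0) forces at least 136/13, so ρ = 136/13.
q2 = [F(0), F(0), F(7, 13), F(6, 13)]
p2 = [F(1, 13), F(12, 13), F(0)]
rows2 = [sum(phi[i][j] * q2[j] for j in range(4)) for i in range(3)]
cols2 = [sum(p2[i] * phi[i][j] for i in range(3)) for j in range(4)]
rho = F(136, 13)
assert max(rows2) == rho and min(cols2) == rho

print(delta, rho)   # 38/13 ≈ 2.92 and 136/13 ≈ 10.46
```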


10. REFERENCES 

[1] A. Charnes and W. W. Cooper, “Chance-Constrained Programming,” Management Science, 6, No. 1, 73-79 (October 1959).

[2] A. Charnes and W. W. Cooper, “Deterministic Equivalents for Optimizing and Satisficing Under Chance Constraints,” Operations Res., 11, No. 1, 18-39 (January-February 1963).

[3] A. Charnes and M. J. L. Kirby, “Optimal Decision Rules for the n-Period E-Model of Chance-Constrained Programming,” Cahiers du Centre d’Études de Recherche Opérationnelle (forthcoming).

[4] A. Charnes and M. J. L. Kirby, “Optimal Decision Rules for the Triangular E-Model of Chance-Constrained Programming,” J. Canadian Operational Res. Soc. (forthcoming).

[5] A. Charnes, W. W. Cooper, and G. L. Thompson, “Constrained Generalized Medians and Linear Programming Under Uncertainty,” Management Science, 12, No. 1, 83-112 (September 1965).

[6] A. Charnes and W. W. Cooper, Management Models and Industrial Applications of Linear Programming, Vols. I and II, Wiley, New York, 1961.

[7] A. Ben-Israel, “On Some Problems of Mathematical Programming,” PhD dissertation in Engineering Science, Northwestern University, June 1962.

[8] M. J. L. Kirby, “Generalized Inverses and Chance-Constrained Programming,” PhD dissertation in Applied Mathematics, Northwestern University, June 1965.


JEUX ZÉRO-ZÉRO À CONTRAINTES PAR LE HASARD

RÉSUMÉ

This paper is the first of a series of studies. The authors wish to extend game theory to payoff matrices subject to probabilistic variations. First, the notions of minimax and maximin strategies are extended. The extension is accomplished by generalizing, in the manner of chance-constrained programming, the linear programming problems equivalent to a two-person zero-sum game. Deterministic equivalent problems, of nonlinear type, are derived for each player. The authors study the properties of families of solutions and of the objective functions of these problems. Some of the results parallel other results in mathematical programming. Among the cases studied, the authors consider the derivation of sufficient conditions for the existence of an optimal pure strategy. Numerical examples are presented, and several extensions of the notions introduced above are discussed.


SOLUTION OF SOME SURVEILLANCE-EVASION PROBLEMS
BY METHODS OF DIFFERENTIAL GAMES

Solution de Quelques Problèmes Surveillance-Évasion
par les Méthodes de Jeux Différentiels

J. M. Dobbie

Arthur D. Little, Inc., Cambridge, Mass.

United States of America


1. INTRODUCTION 

Some important capture-evasion and surveillance-evasion problems can be 
formulated as differential games. In this article we solve a sequence of sur- 
veillance-evasion problems of this type that occurred in a study of the require- 
ments for the surveillance of ocean areas. 

The problem is one of close surveillance that is sometimes called “tracking,” “trailing,” or “tailing.” The last name probably is the most appropriate one, for the objective of one antagonist, the “pursuer,” is to maintain contact on the other antagonist, the “evader,” who attempts to break contact.

The pursuer and evader can choose strategies for their motions from sets 
of admissible strategies, subject to various constraints. The strategies are 
piecewise continuous vector functions of time, which consist of speed, course 
angle, and radius of curvature in our problems. The constraints are limitations 
on the speed ranges and on the radius of curvature in changing course. A choice 
of strategy for the pursuer and for the evader determines a state vector function. 
In our problem it is the vector of rectangular coordinates of the position of the 
evader relative to the position and heading of the pursuer. 





In the general formulation of a differential game there is a payoff functional of the strategy vector functions and the state vector function, which one antagonist wants to maximize and the other wants to minimize by a given time. Such games, in which the payoff usually can have a continuum of values, are called “games of degree” by Isaacs [1]. He uses the term “games of kind” when the payoff has two, or a finite number of, values, and no limitation is placed on the duration of the game; the game is said to terminate when the payoff first attains one particular value.

The problems considered here are games of kind. We start with a simple 
problem in which the pursuer wants to keep the relative position of the evader 
inside a circle, called the detection region, and the evader wants to escape from 
the detection region. Surveillance is said to be maintained, that is, contact is 
held, as long as the relative position is inside or on the boundary of the circle. 

Surveillance ends, and the game terminates, when the evader crosses the 
boundary of the circle. Thus there are two payoff values, one for the position 
inside or on the boundary of the circle and the other for the position outside the 
circle. If the evader starts in a given state, can he escape or can the pursuer 
maintain contact indefinitely? We want to find the surveillance space, which 
is defined as the part of the initial state space for which the pursuer can maintain 
contact indefinitely. The restriction on the detection region is then removed. 
The problem is solved for an arbitrary detection region.

Finally, a problem involving three payoff values is solved. The first value 
determines a continuous surveillance space, as already described. The second 
value determines an intermittent surveillance space from which the evader can 
escape but cannot prevent recapture by a recontact maneuver by the pursuer. 
The third value determines an escape space from which the evader can escape 
and the pursuer cannot be certain of recapture by a recontact maneuver. The 
solution is then applied to a two-speed problem in which the pursuer is restricted 
to a speed range for detection capability but can use higher speeds (with loss of 
detection) to get into position for his recontact attempt. 

Isaacs [1] has shown that differential games include a wide variety of inter- 
esting problems and that an apparently minor change in the conditions can 
produce a major change in the solution. This fact is illustrated again by the 
problems solved here; for example, our first problem is the same as Isaacs’ 
homicidal-chauffeur pursuit problem, except for the objectives of the two 
antagonists. In the surveillance problem the evader is in the interior of the circle 
and trying to cross the boundary, whereas in the pursuit problem the evader is 
outside the circle and trying to avoid contact with the boundary. This change 
produces a fundamental difference between the two problems; a part of the 
solution of the surveillance problem has no counterpart in the solution of the 
pursuit problem. 

Readers who are familiar with control theory will recognize that control
problems may be described as one-sided differential games. A comparison is 
given by Ho, Bryson, and Baron [2], and by Ho [3] in his excellent review of 
Isaacs' book. 





2. A SIMPLE SURVEILLANCE PROBLEM 

In most of the surveillance problems of interest the pursuer has a speed ad- 
vantage over the evader but is at a disadvantage in maneuvering. The assump- 
tions used by Isaacs [1] in his homicidal-chauffeur pursuit problem have these 
properties. 

We adopt the standard naval reference system, with the pursuer at the origin O and the heading of the pursuer along the upward vertical axis. Angles are measured positively in the clockwise direction. On this clockwise polar coordinate system we superimpose a rectangular coordinate system, with the positive y-axis along the pursuer’s heading and the positive x-axis horizontal to the right. Let

s1 = pursuer’s speed,  w1 = maximum speed of the pursuer,

s2 = evader’s speed,  w2 = maximum speed of the evader,

R = minimum turning radius of the pursuer,

Ψ = evader’s course relative to the heading of the pursuer,

d = distance at which the pursuer can detect the evader,

(x, y) = rectangular coordinates of evader’s relative position.

Assume that the pursuer may choose any speed s1 at time t in the range 0 ≤ s1 ≤ w1 and may turn right or left with a radius of turn no smaller than R. Hence the center of the turn is at a point (R/μ, 0), with −1 ≤ μ ≤ 1. Assume that the pursuer can maintain his speed in the turn. Also assume that the evader can choose a speed s2 in the range 0 ≤ s2 ≤ w2 < w1 and a course angle Ψ in the range −π < Ψ ≤ π without restriction on his radius. Thus the evader can vary his course angle freely and can include instantaneous jumps. (This assumption is not a severe limitation on the applicability of the results, for the optimal course angle for the evader in real space is usually constant on the boundary of the surveillance region.)

The evader chooses s2 = s2(t) and Ψ = Ψ(t), subject to the constraint on s2, in an effort to escape from the circle. The pursuer chooses s1 = s1(t) and μ = μ(t), subject to the constraints on s1 and μ, in trying to prevent escape. Assume that the pursuer knows the position and course of the evader when the distance separating them is ≤ d but loses contact when the distance exceeds d. Assume that the evader always knows the position and course of the pursuer.

The detection region may be divided in two, the surveillance region and 
the escape region, either of which might be void. The surveillance region is the 
part of the detection region for which the pursuer can maintain contact in- 
definitely, regardless of the strategy of the evader. The escape region is the part 
of the detection region from which the evader can escape, regardless of the 
strategy of the pursuer. Only on the common boundary, called a barrier by 
Isaacs, are the optimal strategies uniquely determined. We seek the equations 
of the barrier, or barriers, that separate the surveillance region from the escape 
region. 



SOLUTION OF SOME SURVEILLANCE-EVASION PROBLEMS 


173 


The equations of motion are

    ẋ = −(s1 μ/R) y + s2 sin Ψ,
    ẏ = (s1 μ/R) x + s2 cos Ψ − s1.                                (2.1)

These equations can be obtained by elementary methods. They are derived
by Isaacs ([1], p. 30). The first term on the right-hand side is the velocity
component along the rectangular axes induced on the evader's relative motion
by the change in direction of the pursuer while turning. The second term is the
direct component from the evader's motion. The term −s1 in ẏ is the component
induced by the motion of the pursuer along the y-axis.

At a point of a barrier let ν1 and ν2 be the x- and y-components of a unit
normal vector in the escape direction. The velocity component along the
normal is

    ẋ ν1 + ẏ ν2 = −s1 (μ a/R + ν2) + s2 (ν1 sin Ψ + ν2 cos Ψ),    (2.2)

where

    a = y ν1 − x ν2.                                               (2.3)

To minimize (2.2) the pursuer chooses

    s1 = w1,  μ = sgn a.

To maximize (2.2) the evader chooses

    s2 = w2,  sin Ψ = ν1,  cos Ψ = ν2.

Hence on a barrier the antagonists are traveling at maximum speed and the
pursuer is turning with full rudder in the direction determined by sgn a.

With the optimal values the velocity component along the normal on a
barrier must be zero. Substituting in (2.2), we have

    −c (y ν1 − x ν2) − w1 ν2 + w2 = 0,                             (2.4)

where

    c = (w1/R) sgn a.                                              (2.5)

Also, we have

    ν1² + ν2² = 1.                                                 (2.6)

The equations in (2.1) become

    ẋ = −c y + w2 ν1,
    ẏ = c x + w2 ν2 − w1.                                          (2.7)

Equations 2.4, 2.6, and 2.7 determine a barrier.


174 


J. M. DOBBIE 

Differentiating (2.4) and (2.6) and substituting for ν̇2 from (2.6), and ẋ
and ẏ from (2.7), we get

    ν̇2 = c ν1,                                                    (2.8)

from which

    ν̇1 = −c ν2.                                                   (2.9)

It is now easy to solve (2.8) and (2.9) and then to substitute in (2.7) to solve for
x and y.

Let t = 0 at a point A of a barrier for which y ν1 = x ν2, so that the normal
to the barrier lies along the radial line OA. From (2.4) and (2.6) we have

    ν2 = w2/w1 = p,  ν1 = ±√(1 − p²),  at t = 0,

which give two possible initial directions. Let θ be either angle in the interval
−π < θ ≤ π for which cos θ = p. Then the initial values are ν1 = sin θ, ν2 =
cos θ, and the solution of (2.8), (2.9) is

    ν1 = sin(θ − ct),  ν2 = cos(θ − ct).                           (2.10)

Now substitute ν1 and ν2 from (2.10) into (2.7) and solve for x and y. The
initial values can be written in the form

    x = u sin θ,  y = u cos θ,  at t = 0,

where u is a parameter with absolute value equal to the distance OA. The
solution of (2.7) becomes

    x = (u + w2 t) sin(θ − ct) + (1 − cos ct) R sgn a,
    y = (u + w2 t) cos(θ − ct) − (sin ct) R sgn a,                 (2.11)

and a in (2.3) becomes

    a = [cos θ − cos(θ − ct)] R sgn a.                             (2.12)

Equation 2.12 shows that we must have

    cos(θ − ct) < cos θ,                                           (2.13)

which determines the pertinent ranges of t.
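These formulas can be checked numerically. The sketch below (Python; the parameter values and the helper names `barrier` and `normal` are ours, not the paper's) evaluates the involute (2.11), differentiates it by central finite differences, and confirms that it satisfies the barrier equations (2.7) and the identity (2.12), here for the branch sgn a = +1.

```python
import math

# Illustrative parameters (ours): pursuer speed w1, evader speed w2,
# minimum turning radius R, and the branch sgn a = +1.
w1, w2, R = 1.0, 0.5, 2.0
sgn_a = 1.0
c = w1 * sgn_a / R                     # eq. (2.5)
p = w2 / w1
theta = math.acos(p)                   # cos(theta) = p
u = 3.0                                # initial distance OA along L1

def barrier(t):
    """Point of the barrier, eq. (2.11)."""
    s = u + w2 * t
    x = s * math.sin(theta - c * t) + (1 - math.cos(c * t)) * R * sgn_a
    y = s * math.cos(theta - c * t) - math.sin(c * t) * R * sgn_a
    return x, y

def normal(t):
    """Unit normal (nu1, nu2), eq. (2.10)."""
    return math.sin(theta - c * t), math.cos(theta - c * t)

# Compare a central finite difference of (2.11) with the right-hand
# sides of the barrier equations of motion (2.7).
t, h = -0.7, 1e-6
(xa, ya), (xb, yb) = barrier(t - h), barrier(t + h)
xdot, ydot = (xb - xa) / (2 * h), (yb - ya) / (2 * h)
x, y = barrier(t)
n1, n2 = normal(t)
assert abs(xdot - (-c * y + w2 * n1)) < 1e-6
assert abs(ydot - (c * x + w2 * n2 - w1)) < 1e-6

# Identity (2.12): a = y nu1 - x nu2 = [cos(theta) - cos(theta - ct)] R sgn a
a = y * n1 - x * n2
assert abs(a - (math.cos(theta) - math.cos(theta - c * t)) * R * sgn_a) < 1e-9
print("(2.7) and (2.12) verified numerically")
```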

In (2.11) there are two values for c (corresponding to the two values of sgn a),
two values for θ, and u can be positive or negative. The curves represented by
these equations are of four types: expanding and contracting involutes of the
circles 𝒦+ and 𝒦− with centers at (±R, 0) and radius r = pR. If we restrict
attention to the right half-plane by imposing the condition u sin θ > 0 on the
initial value of x, there are only two types, as follows:


    (1)  a > 0,  θ = φ,   u > 0,

    (2)  a < 0,  θ = −φ,  u < 0,

where

    cos φ = p,  sin φ = √(1 − p²).






Figure 1. Surveillance region, d2 < d < d3.

The first type is an involute of 𝒦+ that expands as t increases and passes
through the point (u sin θ, u cos θ) on the tangent line ℒ1 to 𝒦+ at t = 0
(see Figure 1). This curve is the relative path when the evader attempts to es-
cape in the outward direction. To prevent escape the pursuer must put the
evader above ℒ1 without allowing him to get out of the detection region, which
requires u ≤ d. This requirement stems from the fact that only above ℒ1
can the pursuer close the range when the evader takes his optimal course to
open the range. The rate of change of half the square of the range is

    x ẋ + y ẏ = s2 (x sin Ψ + y cos Ψ) − s1 y,

and the maximum of the first term over Ψ is s2 √(x² + y²). Hence the pursuer
can close the range against the evader's efforts to open the range only when
y > p √(x² + y²); that is, above ℒ1 when x > 0.

To get the largest surveillance region we put u = d. The corresponding
curve ℬ1 has the equations

    x = (d + w2 t) sin(φ − ct) + (1 − cos ct) R,
    y = (d + w2 t) cos(φ − ct) − (sin ct) R,                       (2.14)


and the time interval from (2.13) is

    t1 ≤ t ≤ 0,  t1 = (2R/w1)(φ − π).                              (2.15)

ℬ1 terminates, at t = 0, in the point A at the intersection of ℒ1 and the boun-
dary 𝒞 of the detection circle. It starts, at t = t1, on the other tangent line ℒ2,
provided d is large enough (see later).

There are three critical values for d, as follows:

    d1 = R[p + √(1 − p²) + p(3π/2 − 2φ)],
    d2 = R[√(1 − p²) + 2p(π − φ)],
    d3 = R[1 + √(1 − p²) + p(3π/2 − φ)].

With d = d1, ℬ1 is tangent to the line ℒ2 at B and does not cross below it.
With d = d2, ℬ1, at t = t1, starts at the point C at which ℒ2 is tangent to 𝒦+.
With d = d3, ℬ1 is tangent to the negative y-axis.

First, assume that d ≤ d1, so that ℬ1 does not cross below ℒ2. The evader
can escape by running around after closing the range, if necessary. The
evader can close the range when above ℒ2 in the right half-plane, since

    min_(s2,Ψ) max_(s1,μ) (x ẋ + y ẏ) = −w2 √(x² + y²),           y > 0,
                                      = −w2 √(x² + y²) − w1 y,    y < 0,

which is negative above ℒ2. The evader closes the range until he is on an
involute of type (1) with u > d and then runs normal to this involute to escape.
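The two-case value of this min-max can be spot-checked by brute force. In the sketch below (Python; the grid resolution, the speeds, and the sample points are illustrative choices of ours), the evader's course is discretized, the inner optimizations over s1 and s2 are done in closed form because the rate depends linearly on each speed, and the result is compared with the two-case formula above.

```python
import math

w1, w2 = 1.0, 0.5   # illustrative speeds with w2 < w1

def minmax_range_rate(x, y, n=2000):
    """Discretized min over (s2, Psi) of max over (s1, mu) of x*xdot + y*ydot."""
    # x*xdot + y*ydot = s2 (x sin Psi + y cos Psi) - s1 y; mu drops out.
    best = math.inf
    for i in range(n):
        psi = -math.pi + 2 * math.pi * i / n
        radial = x * math.sin(psi) + y * math.cos(psi)
        # evader picks s2 in [0, w2] to minimize; pursuer picks s1 in [0, w1] to maximize
        inner = min(w2 * radial, 0.0) + max(-w1 * y, 0.0)
        best = min(best, inner)
    return best

def closed_form(x, y):
    r = math.hypot(x, y)
    return -w2 * r - w1 * y if y < 0 else -w2 * r

for (x, y) in [(1.0, 0.8), (1.2, -0.3), (0.5, -1.5)]:
    assert abs(minmax_range_rate(x, y) - closed_form(x, y)) < 1e-3
print("min-max range rate matches the two-case formula")
```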

If d > d1, ℬ1 crosses below ℒ2 and there is a surveillance region from
which the evader cannot escape. The shape of this region depends on the
value of d. If d > d3, ℬ1 meets the negative y-axis and forms a barrier in the
right half-plane below ℒ1. This curve and the boundary 𝒞 of the detection
region above ℒ1, together with their reflections about the y-axis, form a heart-
shaped surveillance region.

If d1 < d < d3, ℬ1 crosses below ℒ2 but does not meet its reflection on
the y-axis. If d2 < d < d3, ℬ1 starts at t = t1 on ℒ2, where it is normal to ℒ2.
This case is illustrated in Figure 1. If d1 < d < d2, ℬ1 crosses ℒ2 at a time
greater than t1 and is not normal to ℒ2 at the intersection. In either case let D
be the point in which ℬ1 crosses below ℒ2 as t increases, and let d_D be the
distance OD. The pursuer can prevent the evader from running around
because of the region below ℒ2 that is bounded by ℬ1 and ℒ2. In this region
the pursuer can open the range by turning alternately away from and toward the
evader and using care to keep the evader between ℒ2 and ℬ1.





The pursuer must be able to put the evader into the region between ℒ2
and ℬ1 without allowing him to reach a point on ℒ2 between O and D. The
equations of the barrier ℬ2 through D are obtained by putting u = −d_D,
θ = −φ, sgn a = −1, c = −w1/R in (2.11). The equations are

    x = (d_D − w2 t′) sin(φ − w1 t′/R) − R[1 − cos(w1 t′/R)],
    y = −(d_D − w2 t′) cos(φ − w1 t′/R) − R sin(w1 t′/R),          (2.16)

where t′ is a time variable for which t′ = 0 at D. The curve is a contracting
involute of 𝒦−.

The region bounded by ℬ1, ℬ2, 𝒞 and their reflections in the y-axis is
the surveillance region when d1 < d < d3. This case has no counterpart in the
corresponding homicidal-chauffeur problem. If d2 < d < d3, ℬ1 and ℬ2 have
a common normal at D. On the barriers the path of the evader in real space is a
straight line; the relative path ℬ1 is produced by the pursuer turning away,
whereas ℬ2 is produced by the pursuer turning toward the evader. If d1 <
d < d2, the evader makes a sudden change in his course angle at D; in real space
his path consists of two straight lines.


3. SOLUTION FOR ARBITRARY DETECTION REGIONS 

The solution in Section 2 for the circular detection region can be extended to 
arbitrary detection regions. In fact, the solution in Section 2 has been made 
without explicit use of the fact that the detection region is circular. 

Again, we restrict our attention to the right half-plane, x ≥ 0. The analysis
for the left half-plane can be made separately if the detection region is not
symmetrical about the y-axis.

The requirement for the existence of a surveillance region is the existence
of a region above ℒ1 and a region below ℒ2, into either of which the
pursuer can move the evader from any position in some region without
allowing him to get out of the detection region. The largest such region 𝒮
is the surveillance region. If (d1 sin φ, d1 cos φ) is not in the detection region,
there is no surveillance region. Let d in (2.14) be a variable and increase it from
R√(1 − p²) until ℬ1 first reaches the boundary of the detection region, such as
point A in Figure 2. Erase the part of the involute "above" A, that is, the part
obtained with t greater than the time at A. Continue by letting d increase as long as the
involute (2.14) can be followed for increasing t into the region above ℒ1 with-
out passing out of the detection region. If a second critical point A* is reached
with the barrier, erase the part above A* and continue. In this way we
construct the barriers ℬ1, ℬ1*, ..., of type (1) which are needed to restrict
the region so that the pursuer can put the evader above ℒ1 without allowing
him to escape. The parts of the detection region cut off by ℬ1, ℬ1*, ..., such as
the part below ℬ1* in Figure 2, are escape regions.





Similarly, we construct the barriers of type (2) that are needed to restrict
the region so that the pursuer can put the evader below ℒ2 without allowing
him to escape. If ℬ1 meets the negative y-axis, no barriers of type (2) are needed.
If it does not meet the negative y-axis, construct the barrier ℬ2 through the point
D at which ℬ1 intersects ℒ2. In general, ℬ1 and ℬ2 will cut off an escape re-
gion. Also construct any other barriers of type (2) that are needed, such as ℬ2*
in Figure 2. This can be done by considering d_D in (2.16) as a variable and
letting it decrease. The barriers ℬ2, ℬ2*, ..., of type (2) will also cut off escape
regions.

Barriers of types (1) and (2) will suffice for all detection regions except the
highly pathological ones. A complete treatment, which is not undertaken here,



Figure 2. Surveillance region for noncircular detection region. 


requires the use of barriers of types (3) and (4), which are contracting involutes
of 𝒦+ and expanding involutes of 𝒦−.

4. INTERMITTENT SURVEILLANCE REGIONS 

We use the assumptions of Section 2 on speeds and turning radii. We no longer 
say, however, that surveillance necessarily terminates when the evader leaves 
the detection region. We assume that outside the detection region the pursuer 
cannot detect the evader but that the evader can detect the pursuer and deter- 
mine his course. 

We extend the concept of surveillance to include situations in which contact 
is lost temporarily, provided it can be regained with certainty. For this purpose 
we define two surveillance regions: 




𝒮1: the continuous surveillance region. This is the part of the detection
    region in which the pursuer can maintain contact at all times, regard-
    less of the evader's actions.

𝒮2: the intermittent surveillance region. This is the part of the detection
    region from which the evader can escape temporarily but cannot pre-
    vent the pursuer from regaining contact.

If the pursuer has a speed advantage over the evader, the evader cannot be 
certain of avoiding recontact forever, regardless of what he does. We restrict
the recontact maneuver of the pursuer to one “pass,” in which he turns until 
he calculates that the evader is in a suitable region ahead of him and then goes 
on a straight course in the expectation of regaining contact. If he fails to make 



Figure 3. Continuous and intermittent surveillance regions.


contact on the first pass, we say that contact has not been regained and that the 
regain-contact effort has terminated. Further passes to find the evader are said 
to constitute a new search effort. 

We describe the solution for an arbitrary detection region but limit the
illustrations to the circular detection region. Region 𝒮1 is obtained in Sections
2 and 3. It might be void. For the conditions used in Figure 3 it is the region
bounded by the detection circle and the barriers ℬ1 and ℬ2. An evader
outside 𝒮1 can escape from the detection region. We want to find the part 𝒮2
for which the pursuer can regain contact with certainty, regardless of the
actions taken by the evader.

Draw the lines ℒ3 and ℒ4 normal to ℒ1 and ℒ2, respectively, and just
touching the boundary 𝒞 of the detection region. Draw the lines only above the





points at which they touch 𝒞. The lines ℒ3 and ℒ4 meet at point B, which is
on the y-axis if the detection region is symmetrical about the y-axis. The
triangular region ℛ, bounded by ℒ3, ℒ4, and the boundary of the
detection region, is called the recontact region. If the evader is in ℛ and the
pursuer knows this, the pursuer can regain contact by traveling a straight
course.

According to our assumptions, the evader knows the location and course of 
the pursuer at all times, whereas the pursuer cannot detect the evader when he 
is outside the detection region. Hence, if the evader can escape from the detec- 
tion region and avoid being maneuvered into ℛ at the time the pursuer stops
his turn, he can avoid recontact. 

On the other hand, the pursuer can be certain of regaining contact by 
traveling a straight course after making a turn for a time interval that assures 
him that the evader is in the recontact region. Under what conditions can 
the pursuer be certain that the evader is in the recontact region at a particular 
time, regardless of what the evader does after escaping from the detection 
region? We want to find the largest part 𝒮2 of the escape region for which
the pursuer can be certain of regaining contact by this maneuver. 

From the point B of intersection of the lines ℒ3 and ℒ4 draw the involute
ℬ3 of 𝒦+, intersecting the boundary of the detection region in the point C.
If the evader escapes at C and travels normal to ℒ3, which is a straight line
in real space, his relative path will be the involute ℬ3 through the boundary point
B of the recontact region ℛ. The involute ℬ3 is also the relative path when the
evader tries to maximize his distance from the pursuer at the time of crossing
the y-axis, and the pursuer tries to minimize this distance. This fact can be
proved directly; it follows from the properties of a barrier. If the evader escapes
from the detection region at any point between A and C of the boundary, he
cannot prevent the pursuer from putting him into the recontact region ℛ at
some later time. The pursuer, however, does not know when this will occur,
except by computation. The pursuer can compute the expected position of the
evader under various assumptions and terminate his turn at a time T, if possible,
for which the expected position is in ℛ, regardless of what the evader does.
The evader can attempt to be outside ℛ at time T by trying to stay to the right
of ℒ3 or by trying to get through ℛ and to the left of ℒ4.

Assume that the evader takes his optimal course to get to the right of ℒ3
and stay there. Let T3 be the maximum time after leaving the detection region
at a point E between A and C before he is forced to cross ℒ3 into ℛ. Simi-
larly, let T4 be the minimum time required for the evader to cross ℒ4 when he
takes the optimal course to get to the left of ℒ4. If T3 < T4, as would be the
case for E close to A, the pursuer can turn for a time T for which T3 ≤ T ≤ T4
and be certain that the evader is in ℛ at time T. If T3 > T4, as would be the
case for E close to C, the pursuer cannot be certain that the evader is in ℛ at
any particular time T; he should either turn for a time slightly larger than T3
or for a time slightly less than T4.

The pursuer can be certain of regaining contact when the evader escapes
from the detection region at a point of the boundary for which T3 < T4 and
cannot be certain of regaining contact when T3 > T4. Let ℰ2 be the set of
escape points on 𝒞 for which T3 < T4. Then 𝒮2 is the subregion of the de-
tection region from which the evader can reach a point of ℰ2 when the pursuer
is attempting to prevent his escape. The barrier for 𝒮2 will consist of barriers
of types (1) and (2) through the boundary points of ℰ2, which are usually the
points for which T3 = T4.

At the time of escape t0 the evader can choose the course corresponding
to max T3 or to min T4. In the first case the optimal relative course angle
Ψ(t) is

    Ψ(t) = φ + c(T3 − t) = Ψ03 − ct.                               (4.1)

At t0 the evader chooses his relative course angle

    Ψ03 = cT3 + φ,                                                 (4.2)

normal to the position that ℒ3 will have at t = T3, and travels a straight path
on this course in real space. The expression for Ψ(t) in (4.1) is the corresponding
relative course angle produced by the turning of the pursuer. The evader can
choose the optimal course angle Ψ03 in (4.2) if he can compute T3. A method
of doing this is described later.

The optimal relative course angle for the evader when he is trying to cross
ℒ4 in minimum time is

    Ψ(t) = −φ + c(T4 − t) = Ψ04 − ct.                              (4.3)

The initial course angle at the time of escape is

    Ψ04 = cT4 − φ,                                                 (4.4)

normal to the position that ℒ4 will have at t = T4. Again, the evader can
choose Ψ04 at t0 if he can compute T4.

In both cases we can write

    Ψ(t) = Ψ0 − ct.

Then the equations of motion are

    ẋ = −c y + w2 sin(Ψ0 − ct),
    ẏ = c x + w2 cos(Ψ0 − ct) − w1.                                (4.5)

The solution of (4.5) is

    x = R(1 − cos ct) + w2 t sin(Ψ0 − ct) + r0 sin(θ0 − ct),
    y = −R sin ct + w2 t cos(Ψ0 − ct) + r0 cos(θ0 − ct),           (4.6)

where (r0, θ0) are the polar coordinates of the escape point. The curve repre-
sented by (4.6) is not an involute but a "parallel" curve obtained by displacing
an involute a fixed distance along a line at a fixed angle (say 90°) to the
string.





Let h3(t) be the distance from ℒ3 to the point (x, y), measured normal
to ℒ3 in the outward direction, and let b3 be the distance from the origin to
ℒ3. Then

    h3(t) = x sin φ + y cos φ − b3
          = R sin φ − R sin(ct + φ) + w2 t cos(Ψ0 − ct − φ)
            + r0 cos(θ0 − ct − φ) − b3.

Putting Ψ0 = cT3 + φ and t = T3, we have

    h3(T3) = w2 T3 − b3 + r0 cos(θ0 − φ − cT3) + R sin φ − R sin(cT3 + φ).

Since this must be zero, T3 is a solution of the equation

    w2 T = b3 − r0 cos(cT + φ − θ0) + R[sin(cT + φ) − sin φ].      (4.7)

The desired value of T3 is the second positive solution; the first positive solution
is the time at which the path crosses ℒ3 extended, below ℒ1.

A similar derivation shows that T4 is a solution of the equation

    w2 T = b4 − r0 cos(cT − φ − θ0) + R[sin(cT − φ) + sin φ],      (4.8)

where b4 is the distance from the origin to ℒ4. We want the smallest positive
solution of this equation. Let

    T = f1(φ, b3)

and

    T = f2(φ, b3)

be the first and second positive solutions of (4.7). Then, from (4.7) and (4.8),
we have

    T3 = f2(φ, b3),  T4 = f1(−φ, b4).

Differentiating (4.7), we have

    ∂f(φ, b)/∂φ = −1/c.

Hence

    T3 = f2(φ, b3) = (1/c)[α2(b3) − φ]                             (4.9)

and

    T4 = f1(−φ, b4) = (1/c)[α1(b4) + φ],                           (4.10)

where α1(b) and α2(b) are the first and second positive solutions of (4.7) when
it is written as an equation in α by replacing cT + φ with α and b3 with b. The
equation can be written in the form

    b = rα − r1 cos(θ1 − α),                                       (4.11)

where (r1, θ1) are the polar coordinates of the escape point measured from the
center of 𝒦+. Equation 4.11 can be solved easily by plotting the two sides and
reading off the first two positive intersections.
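In place of the graphical solution, the first two positive intersections of the two sides of (4.11) can also be found numerically. The sketch below (Python; the parameter values are illustrative and `first_two_roots` is our helper, not the paper's) scans g(α) = rα − r1 cos(θ1 − α) − b for sign changes and refines each bracket by bisection.

```python
import math

def first_two_roots(b, r, r1, theta1, alpha_max=20.0, n=20000):
    """First two positive solutions alpha of b = r*alpha - r1*cos(theta1 - alpha)."""
    g = lambda a: r * a - r1 * math.cos(theta1 - a) - b
    roots, prev_a, prev_g = [], 0.0, g(0.0)
    for i in range(1, n + 1):
        a = alpha_max * i / n
        ga = g(a)
        if prev_g == 0.0:
            roots.append(prev_a)
        elif prev_g * ga < 0:                 # sign change: bisect the bracket
            lo, hi = prev_a, a
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        if len(roots) == 2:
            return roots
        prev_a, prev_g = a, ga
    raise ValueError("fewer than two positive roots found")

# alpha_1(b) and alpha_2(b) for illustrative data
a1, a2 = first_two_roots(b=1.0, r=0.2, r1=1.5, theta1=0.3)
assert a1 < a2
g = lambda a: 0.2 * a - 1.5 * math.cos(0.3 - a) - 1.0
assert abs(g(a1)) < 1e-9 and abs(g(a2)) < 1e-9
print("first two positive roots found")
```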





The condition T3 < T4 becomes

    α2(b3) − α1(b4) < 2φ.                                          (4.12)

This condition is applied to find the escape points of 𝒞 corresponding to the
intermittent surveillance region 𝒮2. The boundary of these escape points is
usually the points, such as D in Figure 3, in the circular detection region for
which T3 = T4. Then 𝒮2 is bounded by barriers, such as ℬ4 in Figure 3,
through the boundary of the escape points satisfying (4.12).

The escape points on 𝒞, for which T3 = T4, can be found, when they exist,
by solving (4.7) and (4.8) simultaneously for θ0 and T after replacing r0 with
the function of θ0 for 𝒞. Thus for the circular detection region r0 = d and
b3 = b4 = d. Eliminating θ0, we get

    p² k² = p² (1 − cos cT)² + (k − p cT + p sin cT)²,             (4.13)

where p = w2/w1 and k = d/R. This equation can be solved graphically or
numerically and the solution substituted in the equation

    d sin(cT − θ0) = R(1 − cos cT)                                 (4.14)

to get the position angle θ0 for D. Equation 4.14 is obtained by subtracting
(4.8) from (4.7). The angle cT obtained in solving (4.13) is the angle through
which the pursuer should turn when the evader escapes at the boundary point D.
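As a numerical check on (4.13) and (4.14), the sketch below (Python; the values of p and k and the bracketing step are illustrative choices of ours) finds a positive root cT of (4.13) by bisection, recovers θ0 from (4.14), and verifies that the pair satisfies the original equations (4.7) and (4.8) with b3 = b4 = d and r0 = d.

```python
import math

# Illustrative data (ours): speed ratio p = w2/w1 and k = d/R.
p, k = 0.5, 3.0
R, w1 = 1.0, 1.0
w2, d, c = p * w1, k * R, w1 / R
phi = math.acos(p)

def f13(cT):
    """Residual of (4.13)."""
    return p**2 * k**2 - p**2 * (1 - math.cos(cT))**2 \
           - (k - p * cT + p * math.sin(cT))**2

# bracket the first sign change of (4.13), then bisect
lo = 1e-6
hi = lo
while f13(hi) * f13(lo) > 0:
    hi += 0.01
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if f13(lo) * f13(mid) <= 0:
        hi = mid
    else:
        lo = mid
cT = 0.5 * (lo + hi)
T = cT / c

# theta0 from (4.14): d sin(cT - theta0) = R (1 - cos cT)
theta0 = cT - math.asin(R * (1 - math.cos(cT)) / d)

# verify against (4.7) and (4.8) with b3 = b4 = d and r0 = d
g7 = w2*T - d + d*math.cos(cT + phi - theta0) - R*(math.sin(cT + phi) - math.sin(phi))
g8 = w2*T - d + d*math.cos(cT - phi - theta0) - R*(math.sin(cT - phi) + math.sin(phi))
assert abs(g7) < 1e-9 and abs(g8) < 1e-9
print("(4.13) and (4.14) consistent with (4.7) and (4.8)")
```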


5. A TWO-SPEED PROBLEM

The extension of surveillance to the intermittent type in Section 4 was made
in the solution of a two-speed problem, that in which the pursuer can detect
the evader while using speeds no greater than w1 but loses his detection capa-
bility at higher speeds. Thus he is restricted to speeds in the range 0 ≤ s1 ≤ w1
while in contact with the evader but can use speeds up to w1′ (> w1) while get-
ting into position for his recontact attempt.

The solution is obtained from the results of Section 4 by replacing w1 with
w1′ in those places in the analysis at which the higher speed may be used; these
depend on our assumptions about the pursuer's tactics. Assume that the pursuer
limits his speed to w1 when the evader is inside the detection region, goes to
his highest speed w1′ when the evader escapes from the detection region, and
decreases his speed to w1 when he completes his turn and starts to run straight
in his recontact effort.

The solution is as follows. Let p′ = w2/w1′, r′ = p′R, c′ = w1′/R. Then

    T3 = (1/c′)[α2(b3) − φ],  T4 = (1/c′)[α1(b4) + φ],

where α1(b) and α2(b) are the first and second positive solutions of

    b = r′α − r1 cos(θ1 − α).





For the circular detection region the turning time T and the initial position
angle θ0 for the boundary escape point D are obtained by solving the equations

    p′² k² = p′² (1 − cos c′T)² + (k − p′ c′T + p′ sin c′T)²,
    d sin(c′T − θ0) = R(1 − cos c′T).

6. SOME POSSIBLE EXTENSIONS 

The two-speed problem is an example in which the detection capability varies 
with speed. Can the methods we have described be adapted to solve problems 
in which the variation is continuous? If we allow probabilities of holding con- 
tact other than 0 and 1, our definitions of continuous and intermittent sur- 
veillance no longer have meaning. What are reasonable objectives of the pur- 
suer and evader under such conditions? Extensions in both directions are 
needed. 


7. REFERENCES 

[1] Rufus Isaacs, Differential Games, Wiley, New York, 1965.

[2] Y. C. Ho, A. E. Bryson, Jr., and S. Baron, "Differential Games and Optimal
Pursuit-Evasion Strategies," IEEE Trans. Automatic Control, AC-10(4), 385-389
(1965).

[3] Y. C. Ho, Book Review of "Differential Games," by R. Isaacs, IEEE Trans. Automatic
Control, AC-10(4), 501-503 (1965).

SOLUTION OF SOME SURVEILLANCE-EVASION
PROBLEMS BY THE METHODS OF
DIFFERENTIAL GAMES

ABSTRACT

An important problem in the surveillance of ocean regions, and in other
surveillance efforts, is that in which a pursuer tries to maintain tracking
contact with an evader who, for his part, tries to break the contact. What are
the optimal strategies for the two antagonists? Under what conditions does
there exist a region in which (a) the pursuer can maintain contact indefinitely,
(b) the evader can break contact without preventing the pursuer from
regaining it, (c) the evader can break contact without an immediate
resumption of contact?

A sequence of problems of increasing complexity is treated by methods
proposed by Rufus Isaacs in his book Differential Games, Wiley, 1965, and by
extensions of these methods. Under the assumptions employed by Isaacs in a
corresponding pursuit problem, that of the homicidal chauffeur, the
surveillance problem is solved for arbitrary detection regions and for a case in
which the detection capabilities of the pursuer are determined by his speed.





k-OPTIMAL STRATEGIES IN FINITE STOCHASTIC
DYNAMIC PROGRAMMING

(Stratégies k-optimales dans les programmes
dynamiques stochastiques finis)

Arnold Kaufmann† and Roger Cruon‡
1. INTRODUCTION 

Dynamic programming is frequently used to maximize (or minimize) the
expected gain (or loss) in a sequential problem. However, even when the
criterion of optimizing the mathematical expectation seems to express, to a
first approximation, the wishes of the decision maker, the latter generally
wants to be able to take account of secondary criteria, which are not always
easy to handle in the form of constraints. Hence the interest, recognized long
ago, of exploring the neighborhood of the solution that appears optimal in the
model.

When the set of possible strategies is finite, one way of meeting this need
is to determine subsets π(1), π(2), ... of strategies such that all the strategies
belonging to π(i) are equivalent,§ and a strategy of π(i) is strictly better§ than a
strategy of π(j) if i < j. Building on the work of Bellman and Kalaba [2], we
treated this problem in [4] for dynamic programs with a deterministic future.
We propose here to study the more delicate case of a random future.

For simplicity of exposition, we shall use a model (described in Section 2)
in which the decisions and the interventions of chance are separated and
alternate in time. From the theoretical point of view this is not restrictive,
since every finite sequential problem can be put into this form. In practice it
might be more convenient in some cases to use a different model; the results
described here would then have to be adapted accordingly.


2. THE DECISION-CHANCE (D.H.) MODEL
IN DECOMPOSED FORM

We shall use a model, which we have called [3] the "D.H. model in
decomposed form," defined as follows:

1. Consider a finite graph G = (z, Γ) such that:

(a) z = x0 ∪ y1 ∪ x1 ∪ ... ∪ x_{N−1} ∪ y_N ∪ x_N = x ∪ y,

† IiuH-G.E., France.
‡ Centre Interarmées de Recherche Opérationnelle, France.
§ In the sense of the mathematical-expectation criterion.



where the subsets x_n, n = 0, 1, ..., N, and y_n, n = 1, ..., N, are nonempty
and disjoint, and where

    x = x0 ∪ x1 ∪ ... ∪ x_N  and  y = y1 ∪ y2 ∪ ... ∪ y_N;

(b) Γx_{n−1} = y_n, n = 1, ..., N;

(c) Γx_N = ∅;

(d) x ∈ x − x_N ⟹ Γx ≠ ∅;

(e) Γy_n = x_n, n = 1, ..., N;

(f) y ∈ y ⟹ Γy ≠ ∅.

The vertices of x (resp. of y) and the arcs issuing from them are called
"decision vertices" and "decision arcs" (resp. "chance vertices" and "chance
arcs").

2. To each arc μ of the graph G is attached a "value" v(μ). In addition,
to each chance vertex y ∈ y is attached a probability distribution over the arcs
issuing from y, each arc having a nonzero probability (otherwise the arc would
be deleted).

In practice the graph G will represent the possible evolutions of a partially
controlled system subject to random factors.

Figure 1 gives an example of a D.H. model in decomposed form. The
numbers in parentheses are the probabilities associated with the chance arcs;
the other numbers are the values associated with the arcs of the graph.
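A decomposed D.H. model can be stored directly as a pair of adjacency maps. The sketch below (Python; the encoding and the two-stage toy example are ours, not the paper's) attaches a value to each decision arc and a value-probability pair to each chance arc, and checks conditions (d) and (f) together with the normalization of the chance distributions.

```python
# A toy decomposed D.H. model (our encoding, not the paper's):
# decision vertices map to {successor: value}; chance vertices map
# to {successor: (value, probability)} with probabilities summing to 1.
decision_arcs = {
    "x0": {"y1": 0.0},                 # one decision arc out of x0
    "x1a": {"y2": 1.0},
    "x1b": {"y2": 2.0},
    "x2": {},                          # terminal: Gamma(x_N) is empty
}
chance_arcs = {
    "y1": {"x1a": (0.0, 0.4), "x1b": (0.0, 0.6)},
    "y2": {"x2": (3.0, 1.0)},
}

# Condition (d): every nonterminal decision vertex has at least one arc.
assert all(arcs for v, arcs in decision_arcs.items() if v != "x2")
# Condition (f) and nonzero, normalized probabilities on chance arcs:
for y, arcs in chance_arcs.items():
    assert arcs and all(pr > 0 for _, pr in arcs.values())
    assert abs(sum(pr for _, pr in arcs.values()) - 1.0) < 1e-12
print("toy D.H. model is well formed")
```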


3. STRATEGIES

A "decision y at x," where x ∈ x and y ∈ Γx, is the marking of an arc (x, y)
issuing from x; this decision may be denoted (x, y), or simply y if x is already
specified.

A "strategy at y," where y ∈ y is a chance vertex, is a set of pairs, each
formed of a vertex x′ ∈ Γy and a strategy at x′ (strategies at a decision vertex
are defined below):

    π(y) = {[x′, π(x′)]},  x′ ∈ Γy.                                (1)

A "strategy at x" is the empty set if x ∈ x_N; if x ∈ (x − x_N) it is a set
with a single element, that element being the pair formed by a decision y at x
and a strategy at y:

    π(x) = {[y, π(y)]}.                                            (2)

The decision y that appears in the pair defining π(x) will be called the "first
decision" of the strategy at x.

Relations (1) and (2) define a partial order on the set of strategies, a
strategy appearing on the right-hand side of one of these relations being
"included" in the strategy appearing on the left-hand side. We shall call
"membership" the transitive closure of this order relation, and we shall say
that a strategy π′ belongs to a strategy π (or that π contains π′) if π′ appears
on the right-hand side of the relation (1) or (2) defining π, or if there exists a
sequence of strategies π1, π2, ..., πk such that the same property holds for
every pair of successive strategies in the sequence π, π1, ..., πk, π′. We shall
also say that a decision (x, y) belongs to a strategy π [or that π contains (x, y)]
if this decision is the first decision of a strategy belonging to π.

In general a strategy cannot be represented without ambiguity by the
marking of decision arcs on the graph G. Indeed, a strategy may include
more than one decision at the same vertex x ∈ x − x_N. On the other hand, a
strategy at z can always be represented without ambiguity by the marking of
decision arcs† on the arborescence G′(z) of the paths of the graph G issuing
from z. A

† The vertices of the arborescence G′(z) are classified as decision vertices and
chance vertices according to the nature of the terminal endpoint of the path of G that
they represent; similarly, the arcs of the arborescence are classified according to the
nature of the homologous arcs of G.






strategy at z thus appears as a subarborescence† of the arborescence
G′(z).

Figure 2 gives an example of a dynamic program somewhat more com-
plicated than that of Figure 1. A strategy at x0¹ is represented in
Figure 3; each vertex represents a path of the graph of Figure 2, but
for simplicity only the terminal endpoint of this path is indicated. In

Figure 3.

† More precisely, one could show that the definition of a strategy given above is
equivalent to the following: a strategy at z is a subarborescence of G′(z) such that every
decision vertex has an exterior half-degree equal to 1 (except, obviously, the vertices
belonging to x_N) and every chance vertex y has an exterior half-degree equal to that of
the homologous vertex of G.




Figure 2, the arcs corresponding to decisions belonging to the strategy have been drawn as heavy dotted lines; the ambiguity pointed out above appears in this figure, the decision taken at x_2^3 depending on the past evolution of the system.

The definition, by means of relations (1) and (2), of the strategy in x_0^1 translates into the following recurrence relations:

π(y_1^1) = {[x_1^1, π(x_1^1)], [x_1^2, π(x_1^2)]}

π(x_1^1) = {[y_2^2, π(y_2^2)]}

π(x_1^2) = {[y_2^1, π(y_2^1)]}

π(y_2^2) = {[x_2^1, π(x_2^1)], [x_2^2, π(x_2^2)], [x_2^3, π(x_2^3)]}

π(y_2^1) = {[x_2^2, π(x_2^2)], [x_2^3, π'(x_2^3)]}

π(x_2^2) = {[y_3^2, π(y_3^2)]}

π(x_2^3) = {[y_3^1, π(y_3^1)]}

π'(x_2^3) = {[y_3^2, π(y_3^2)]}

π(y_3^1) = {[x_3^1, π(x_3^1)]}

π(y_3^2) = {[x_3^2, π(x_3^2)], [x_3^3, π(x_3^3)]}

π(x_3^1) = ∅

π(x_3^2) = ∅

π(x_3^3) = ∅

3.1. Markovian Strategies

By definition, a strategy in z ∈ Z is "Markovian" if it does not include two different decisions at a same vertex; this is not the case for the strategy π(x_0^1) shown in Figures 2 and 3, which includes two distinct decisions at x_2^3. A Markovian strategy can obviously be represented without ambiguity by a marking of G.

Markovian strategies are the only ones usually taken into consideration [1] [3]; the justification will be given in the next section.


4. VALUE OF A STRATEGY: THE k-OPTIMALITY THEOREM

To a strategy in z ∉ X_N there corresponds a certain (finite) number of possible evolutions of the system, supposed initially in the state corresponding to the vertex z; we shall call these possible evolutions "realizations". The realizations are paths of the graph G, obtained in the following way: If z ∈ (X - X_N), the first arc of the path is the arc marked by the first decision y of π(z), and the path is completed by a realization of the strategy π(y) that belongs to π(z). If z ∈ Y, the first arc (z, x) is one of the arcs issued from z, and the path is completed by a realization of the strategy π(x) that belongs to π(z).

The value* of a realization of π(z) is the sum of the values attached to the arcs of the corresponding path; the probability of a realization is the product of the probabilities attached to the chance arcs of this path.

* Operations other than addition (notably multiplication) can be used. See [3], pp. 264-266.





The value of the strategy π(z) is a random variable defined on the set of the values of the realizations of this strategy, the probability of each of them being the probability of the corresponding realization. If z ∈ X_N, the value of π(z) is zero with probability 1.

We shall suppose, as is usually done in dynamic programming, that the comparison of strategies is based on the mathematical expectation (m.e.) of their value.

It follows immediately from the preceding definitions that, if s[π(z)] denotes the m.e. of the value of π(z), we have:

z ∈ (X - X_N), π(z) = {[y, π(y)]}  ⇒  s[π(z)] = v(z, y) + s[π(y)]   (3)

y ∈ Y, π(y) = {[x_1, π(x_1)], [x_2, π(x_2)], ..., [x_r, π(x_r)]}
  ⇒  s[π(y)] = Σ_{i=1}^r p(y; x_i)[v(y, x_i) + s[π(x_i)]]   (4)

Using the same vocabulary as in [4], we shall say that a strategy in z is k-optimal if the m.e. of its value is the k-optimum of the set of the m.e.'s of the values of the strategies in z. If one restricts oneself to Markovian strategies, one will say that a (Markovian) strategy in z is (k - M)-optimal if the m.e. of its value is the k-optimum of the set of the m.e.'s of the values of the Markovian strategies in z.
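Relations (3) and (4) define the expected value of a strategy by a backward recursion on its tree of decisions and chance moves. A minimal sketch of that recursion (the toy graph, values v, and probabilities p below are invented for illustration, not the paper's Figure 1):

```python
# Expected value of a strategy per relations (3) and (4):
# decision vertex -> arc value plus the value of the chosen successor;
# chance vertex   -> probability-weighted sum over all successors.
# The graph below is hypothetical, not the paper's Figure 1.

def expected_value(node, kind, v, p, strategy, successors):
    """kind[node] is 'decision', 'chance' or 'terminal'."""
    if kind[node] == 'terminal':
        return 0.0                                   # value is 0 for z in X_N
    if kind[node] == 'decision':                     # relation (3)
        y = strategy[node]                           # the single decision [y, pi(y)]
        return v[(node, y)] + expected_value(y, kind, v, p, strategy, successors)
    return sum(p[(node, x)] *                        # relation (4)
               (v[(node, x)] + expected_value(x, kind, v, p, strategy, successors))
               for x in successors[node])

kind = {'x0': 'decision', 'y1': 'chance', 'x1': 'terminal', 'x2': 'terminal'}
v = {('x0', 'y1'): 2.0, ('y1', 'x1'): 1.0, ('y1', 'x2'): 5.0}
p = {('y1', 'x1'): 0.75, ('y1', 'x2'): 0.25}
strategy = {'x0': 'y1'}
successors = {'y1': ['x1', 'x2']}
print(expected_value('x0', kind, v, p, strategy, successors))  # 2 + 0.75*1 + 0.25*5 = 4.0
```

Running this same traversal over every admissible strategy is exactly the enumeration that the recursions of Section 5 avoid.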


Theorem 1 (k-Optimality Theorem). If a strategy π(z') in z' belongs to a strategy π(z) in z, and if π(z) is k-optimal (k ≥ 1), then π(z') is i-optimal with i ≤ k.

Proof. The property being transitive, it suffices to prove it in the particular case where z' ∈ Γz.

There exists an i such that π(z') is i-optimal, since the strategies are finite in number. If we had i > k, there would exist k strategies in z', which we shall denote π^(1)(z'), ..., π^(k)(z'), whose m.e.'s of values would be distinct and better than that of π(z').

Suppose first z ∈ (X - X_N), and let:

π_l(z) = {[z', π^(l)(z')]},   l = 1, 2, ..., k   (5)

We have, by (3):

s[π(z)] = v(z, z') + s[π(z')]

and

s[π_l(z)] = v(z, z') + s[π^(l)(z')]

The m.e.'s of the values of the k strategies in z defined by (5) would therefore be distinct and better than that of π(z), which is contrary to the hypothesis that π(z) is k-optimal.




If z ∈ Y, and if:

π(z) = {[z', π(z')], [x_1, π(x_1)], ..., [x_r, π(x_r)]}

one sets:

π_l(z) = {[z', π^(l)(z')], [x_1, π(x_1)], ..., [x_r, π(x_r)]},   l = 1, 2, ..., k   (6)

We then have, by (4):

s[π_l(z)] = p(z, z')[v(z, z') + s[π^(l)(z')]] + Σ_{i=1}^r p(z, x_i)[v(z, x_i) + s[π(x_i)]]

and one arrives at the same conclusion.

For k > 1 there is no analogous theorem for Markovian strategies: π(z) can be (k - M)-optimal and π(z') (i - M)-optimal with i > k; we shall see an example in Section 6. What defeats the reasoning in this case is that the strategies defined by (5) or (6) are not necessarily Markovian, even if the strategies π(z), π(z') and π^(l)(z') are. On the other hand, another theorem can be stated.

Theorem 2. A strategy in z which is (1 - M)-optimal is also 1-optimal.

In other words, among the 1-optimal (or, more simply, optimal) strategies there is always a Markovian strategy. Indeed, let π(z) be a (1 - M)-optimal strategy and π'(z) a 1-optimal strategy. If π'(z) is Markovian, the m.e. of its value is the same as that of π(z) (for a Markovian 1-optimal strategy is (1 - M)-optimal), and the theorem is proved. If π'(z) is not Markovian, there exists a vertex z' for which there exist distinct strategies π_1(z'), π_2(z'), ..., π_r(z'), all belonging to π'(z); but, by virtue of Theorem 1, these strategies are all 1-optimal, hence equivalent, and they can all be replaced by one of them without changing the m.e. of the value of π'(z). Proceeding in this way at every vertex where necessary, one obtains a Markovian 1-optimal strategy, and the theorem is proved for the same reason as above.

It is easy to see that, if all the optimal Markovian strategies are known, all the optimal strategies can be reconstituted by combining in every possible way the equivalent strategies at a same vertex.


5. CALCULATION OF THE k-OPTIMAL STRATEGIES

Theorem 1 provides a means of calculating the k-optimal strategies.

Let f_n^(i)(x) be the m.e. of the value of an i-optimal strategy in x ∈ X_n, and g_n^(i)(y) the m.e. of the value of an i-optimal strategy in y ∈ Y_n. We then have, by Theorem 1:

f_n^(i)(x) = opt^(i)_{y ∈ Γx, j ∈ K} [v(x, y) + g_n^(j)(y)]   (7)





where K = {1, 2, ..., k} and where opt^(i) denotes the i-optimum of a set of reals.

The recurrence formula for computing g_n^(i)(y) is a little more complicated to write. Denote by x_1, x_2, ..., x_r the elements of Γy, and by J = (j_1, j_2, ..., j_r) the r-tuple indicating the degree of optimality of the strategy that will be applied at the next stage, according to the state x_l designated by chance. We then have†:

g_n^(i)(y) = opt^(i)_{J ∈ K^r} Σ_{l=1}^r p(y, x_l)[v(y, x_l) + f^(j_l)(x_l)]   (8)

Formulas (7) and (8) constitute the equivalent, for stochastic problems, of the formulas given by Bellman and Kalaba [2].

One can, however, by a finer analysis, obtain a faster method, as we have already done in [4] for the case of a program with certain future. For the decision vertices the formulas are, moreover, exactly the same.

One sets, ∀y ∈ Γx:

j^(1)(x, y) = 1

then one computes, successively for i = 1, 2, ..., k, the quantities below; to lighten the notation, we have set α = j^(i)(x, y) in the expressions appearing in the right-hand sides:

f_n^(i)(x) = opt_{y ∈ Γx} [v(x, y) + g_n^(α)(y)]   (9)

y*(x) = {y | y ∈ Γx, v(x, y) + g_n^(α)(y) = f_n^(i)(x)}   (10)

Π^(i)(x) = ∪_{y ∈ y*(x)} {[y, π(y)] | π(y) ∈ Π^(α)(y)}   (11)

j^(i+1)(x, y) = α + 1,   if y ∈ y*(x)   (12)

j^(i+1)(x, y) = α,   if y ∉ y*(x)   (13)

Thus, for given i and y, one uses in the right-hand side of (9) the smallest integer α that has not already been used to form an l-optimal strategy in x with l < i. The set obtained in (11) is the set of the i-optimal strategies in x, and (9) represents the m.e. of the value of any one of these strategies.

The case of the chance vertices is more delicate, the domination structure of the strategies at such a vertex being more complex. Let y ∈ Y, and let, as above:

Γy = {x_1, x_2, ..., x_r}

For every integer r-tuple J = (j_1, j_2, ..., j_r), we shall denote by Π_J(y) the set of the strategies π(y) in y that are such that, for every l (l = 1, ..., r), the strategy π(x_l) that belongs to π(y) is j_l-optimal:

Π_J(y) = {π(y) | π(x_l) ∈ Π^(j_l)(x_l), l = 1, ..., r}   (14)

† Note that, for k = 1, the r-tuple J has only one possible value. There is thus, in fact, no optimization to carry out.




A strategy π(y) ∈ Π_J(y) is strictly dominated by at least m strategies, where:

m = Σ_{l=1}^r (j_l - 1) = Σ_{l=1}^r j_l - r   (15)

Indeed, such a strategy is strictly worse than the one obtained by replacing a strategy π(x_l) by a strategy whose rank of optimality is (j_l - 1); by continuing to decrease the j_l, in an arbitrary order, until they are all equal to 1, one has indeed found m strategies better than π(y). This corresponds to the fact that the sets Π_J(y) form a semilattice, in which a set Π_J(y) lies at level (m + 1). Figure 4 gives an example, for the case r = 3.



Figure 4. Hasse diagram of the semilattice of the strategies that are candidates for 4-optimality at a chance vertex y such that |Γy| = 3. Each 3-tuple represents the ranks of optimality of the strategies adopted at the vertices to which chance may lead.




Theorem 1 indicates simply that a k-optimal strategy at a chance vertex y belongs to a subset Π_J(y) such that each component of J is at most equal to k. Applying this theorem leads to searching for the k-optimal strategies among a set of k^r "candidate" strategies. But the preceding considerations make it possible to reduce the number of candidate sets to C^r_{k+r-1}, the number of solutions of the inequality (16) appearing in the theorem below.

Theorem 3. At a chance vertex y such that Γy = {x_1, x_2, ..., x_r}, if the strategy π(y) = {[x_1, π(x_1)], [x_2, π(x_2)], ..., [x_r, π(x_r)]} is k-optimal, and if the strategies π(x_1), ..., π(x_r) are respectively j_1-optimal, ..., j_r-optimal, then:

Σ_{l=1}^r j_l ≤ k + r - 1   (16)

The recurrence formulas obtained by using Theorem 3 are the following.

One sets:

J_i = {J | Σ_{l=1}^r j_l = r + i - 1},   i = 1, 2, ..., k

and

J^(1) = J_1

The set J_i is the set of the r-tuples J corresponding to level i in the semilattice. Then one computes, successively for i = 1, 2, ..., k:

g_n^(i)(y) = opt_{J ∈ J^(i)} Σ_{l=1}^r p(y, x_l)[v(y, x_l) + f^(j_l)(x_l)]   (17)

J_i* = {J | J ∈ J^(i), Σ_{l=1}^r p(y, x_l)[v(y, x_l) + f^(j_l)(x_l)] = g_n^(i)(y)}   (18)

Π^(i)(y) = ∪_{J ∈ J_i*} Π_J(y)   (19)

J^(i+1) = (J^(i) ∪ J_{i+1}) - J_i*   (20)

Thus the set (20) of candidates, in the search for the (i + 1)-optimum, will be obtained from the preceding one by removing the r-tuples J corresponding to i-optimal strategies and adjoining the r-tuples J of the next level of the semilattice.
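Level i of the semilattice contains the r-tuples J with every j_l ≥ 1 and Σ j_l = r + i - 1, so the candidates examined through level k are exactly the solutions of inequality (16). A short sketch of the enumeration (brute force over a product set, adequate for small r and k):

```python
from itertools import product
from math import comb

def level(r, i):
    """All r-tuples J = (j_1, ..., j_r), j_l >= 1, at level i of the
    semilattice, i.e. with sum(J) = r + i - 1 (each j_l <= i suffices)."""
    return [J for J in product(range(1, i + 1), repeat=r)
            if sum(J) == r + i - 1]

r, k = 3, 4
candidates = [J for i in range(1, k + 1) for J in level(r, i)]
# Levels 1..k hold 1 + 3 + 6 + 10 = 20 tuples, i.e. the number of
# solutions of inequality (16), which is C(k+r-1, r) = C(6, 3) = 20.
print(len(candidates), comb(k + r - 1, r))  # 20 20
```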

The above procedure is not the one that leads to examining the minimum number of candidates at each stage. Indeed, suppose one determines successively the 1-, 2-, ..., i-, ..., k-optimal strategies, marking each time on the semilattice the vertices corresponding to the strategies retained. At stage i, that is, when the l-optimal strategies have already been marked for l < i, it is clear that one can take, as the set of candidates for i-optimality, the set of the vertices of the graph all of whose ancestors are marked. Indeed, if a vertex has an unmarked ancestor, it is worse than the latter and therefore cannot be i-optimal. However, determining the vertices having only marked ancestors requires auxiliary operations, which risk




counterbalancing, and more, the time saved by the reduction of the set of candidates. The choice of the best procedure for the chance vertices would require extensive experimentation.

6. NUMERICAL EXAMPLE

The following table gives the calculation of the k-maximal strategies (k ≤ 4) of the dynamic program of Figure 1. The left-hand part of the table (columns 1, 2 and 3) indicates, for each vertex, the m.e. of the value of the strategies and the indications needed to reconstitute them. For example, at the decision vertex M, the maximal strategy comprises the decision (M, J), followed by the maximal strategy in J; there are two 3-maximal strategies in M, comprising respectively the decision (M, J) followed by the 3-maximal strategy in J, and the decision (M, K) followed by the maximal strategy in K. The right-hand part of the table (columns 4, 5 and 6) gives intermediate results:

1. For the decision vertices x, the presentation is the same as that used in [4]. For each of the vertices y ∈ Γx, column (6) gives the values of v(x, y) + g^(i)(y), i = 1, 2, 3, 4, and column (5) indicates the quantities j^(i)(x, y) (see formulas 9, 12 and 13).

2. For the chance vertices y, column (6) gives, for each vertex x_l ∈ Γy, the quantities p(y, x_l)[v(y, x_l) + f^(i)(x_l)], i = 1, 2, 3, 4, and column (5) gives the sets J^(i) of candidates obtained with the help of formula (20).

As an example, we present in Figure 5 a 2-maximal strategy in Q. One may note that the 3-maximal strategy in O is Markovian (it is shown in Figure 5 as belonging to the strategy in Q);






the two 2-maximal strategies in O being non-Markovian (each of them includes both the decisions (H, E) and (H, F)), this 3-maximal strategy in O is (2 - M)-maximal. Now it comprises the 3-maximal strategy in N, which is itself (3 - M)-maximal (all the strategies in N are Markovian). This constitutes the example announced in the comment following Theorem 1: a (k - M)-optimal strategy may comprise (i - M)-optimal strategies with i > k.


7. CONCLUSION

The strategies usually employed in dynamic programming, which we have here called Markovian strategies, do not satisfy the k-optimality theorem given in [4] for the policies of programs with certain future. Extending the dynamic-programming method to the search for a k-optimum requires a broadening of the notion of strategy. Independently of the practical utility that the method we have described may have, the effort of reflection required when one attacks the problem of k-optimality under a random future seems to us enriching; it would be sufficient justification for our work to have drawn attention to this problem.

8. REFERENCES

[1] R. Bellman, Dynamic Programming, Princeton University Press, 1957.

[2] R. Bellman and R. Kalaba, "On kth Best Policies", J. Soc. Indust. Appl. Math., 8, 4, 582-588 (December 1960).

[3] A. Kaufmann and R. Cruon, La Programmation Dynamique et Ses Applications, Dunod, Paris, 1965.

[4] A. Kaufmann and R. Cruon, "Étude de la sensibilité en programmation dynamique: politiques k-optimales en avenir certain", Rev. Française Rech. Opérationnelle, No. 32, 293-302 (3e trim. 1964).

[5] M. Pollack, "Solutions of the kth Best Route Through a Network: A Review", J. Math. Analysis Applications, 3, 3, 547-559 (December 1961).


k-OPTIMAL STRATEGIES IN FINITE
STOCHASTIC DYNAMIC PROGRAMMING

RÉSUMÉ

In a stochastic dynamic programming problem with a finite number of policies a 
method is given to find k subsets of policies (each subset may contain only one 
policy), the first subset containing the optimum policies, the second containing 
the policies that would be optimum if the preceding policies were deleted, and 
so forth. 

The method is an extension of the usual dynamic programming and is based 
on a k-optimality theorem which generalizes Bellman's theorem. This theorem,
however, is valid only for a certain class of policies that is more general than the 
usual “ Markovian policies” class. In that class the decision chosen at any given 
time may depend on the whole sequence of states leading to the present one. 
The structure of this class of policies is analyzed. 



[Table: Recherche des stratégies k-optimales, k ≤ 4 (continued over several pages). The tabular data of the k-optimal strategy computation described in Section 6, columns 1-6, is not legible in this scan.]


ADVANCES IN TECHNIQUES OF MODELING


The Organization of an Industrial System 
as a Servomechanism 

L'Organisation d'un Complexe Industriel
Comme un Servo-mécanisme

H. J. Grunwald 

Philips Research Laboratories 
Eindhoven , The Netherlands 


The structure of the organization of a large industrial complex is considered as a feedback control system. For that purpose reality must be reduced by drastic simplification. There are only a few kinds of level, flow, transformation, control procedure, and decision rule in the model of a system.

The concept of industrial dynamics (developed by J. W. Forrester) describes the behavior of such a system by relatively simple differential equations. In this article the behavior is described by difference equations and logical equations to get a better approximation without losing the possibility of analytical treatment and solution.

The formulation of the equations is illustrated by a case study of a complicated industrial system. It is done in subsequent stages by mapping the topological structure, designing the flow diagram, and formulating the set of equations, or, alternatively, the set of FORTRAN statements. Moreover, special functions are defined as generally useful for such systems.

Analytical techniques of control theory and the concept of stochastic processes as well as simulation techniques are used to describe the behavior of the system in a changing environment. The behavior is considered as a walk in the phase space, connected with certain figures of merit (e.g., the mean deviation of the production or maximum jumps of the production).

It is shown that such a model is a useful tool in the study of various problems by experiment:

1. The effect of a more or less precise approximation of the real system.

2. The effect of applying different operations research techniques to forecast the demand and to solve the production smoothing problem.

3. The sensitivity of the system.

4. The construction of a utility function from the reactions of the managers to the changes of the figures of merit.

Finally an over-all optimization is discussed.


The structure of the organization of a large industrial system is characterized as a feedback system. For this model, reality must be scaled down: there are only a few levels, flows, transformations, control procedures and decision rules in the model of the system.

The theory of industrial dynamics (of J. W. Forrester) characterizes such a system by fairly simple differential equations. In the exposition below the system is characterized by difference equations and logical equations, in order to obtain a better approximation without losing the possibility of analytical solutions.

The formulation of these equations is demonstrated by the study of a specific industrial system. The description proceeds in stages: mapping the topological structure, drawing the flow diagram, formulating the set of equations and, in turn, the FORTRAN program. In addition, special functions useful for such systems are defined.

Analytical techniques of control theory and stochastic processes, as well as simulation techniques, are used to describe the system in a changing environment.



SESSION A 


THEORY OF GRAPHS 
Théorie des Graphes
Chairman: C. Berge (France) 




FLOWGRAPHS APPLIED TO CONTINUOUS 
GENERATING FUNCTIONS 

Graphes des Flots Appliqués
aux Fonctions Génératrices Continues

C. S. Lorens 

Aerospace Corporation, El Segundo, California
United States of America 


1. INTRODUCTION

Flowgraphs are an analytic tool often used in the modeling and analysis of linear systems. An appropriate formulation of a continuous Markov system with generating functions results in a flowgraph identical to the Markov system and represents a number of its properties. This article describes the flowgraph formulation of continuous Markov systems in terms of generating functions of the transition-time probability densities.

The consideration of transient systems produces the average transient duration and variance to individual closed sets. Transient analysis is applicable to reliability analysis and mission- or work-completion analysis.

The consideration of recurrent systems produces the stationary state distribution, and an analysis of these systems is applicable to the continuous operation of a service unit subject to periods of standby and repair.

This article introduces the terminology applicable to continuous Markov systems. By the use of generating functions and the development of transition densities, a set of simultaneous linear equations is obtained which in flowgraph notation is identical to the Markov system. Appropriate analysis of the flowgraph then leads to the properties of the Markov system. A number of examples that deal with mission completion, reliability, and operation of a service unit are included.

The use of flowgraphs in the representation and analysis of a simultaneous set of linear equations is a technique that is being used in an ever enlarging set of applications. A treatment of the development and application of flowgraphs is given in [1], which also contains an extended list of references in which additional material may be found.

2. ONE-STEP TRANSITION PROBABILITY DENSITIES

A Markov system is a set of the possible states of a system and the transition properties associated with its movement from one state to the next. In a continuous Markov system the transition from state s_i to state s_j is characterized by the one-step transition-time probability density p_ij(τ) at the elapsed time τ. The one-step transition probability p_ij is then the integral

p_ij = ∫_0^∞ p_ij(τ) dτ.

It follows that the first and second moments of the transition time, τ_ij and τ_ij^(2), are

τ_ij = (1/p_ij) ∫_0^∞ τ p_ij(τ) dτ,    τ_ij^(2) = (1/p_ij) ∫_0^∞ τ² p_ij(τ) dτ.

The first and second moments of the state duration of the state s_i are then

τ_i = Σ_{j=1}^m p_ij τ_ij,    τ_i^(2) = Σ_{j=1}^m p_ij τ_ij^(2),

where the summation is over the states of the Markov system.

The n-step transition-time probability density p_ij^n(τ) of moving from state s_i to state s_j in n steps at the elapsed time τ is the convolution of the (n - 1)-step and the one-step transition-time probability densities, summed over the states of the system:

p_ij^n(τ) = Σ_{r=1}^m ∫_0^∞ p_ir^{n-1}(σ) p_rj(τ - σ) dσ.

This is a recurrent set of relations in which, for convenience, the zero-step transition-time probability density is defined as

p_ij^0(τ) = δ_ij(τ),

where δ_ij(τ) is the Kronecker delta function

δ_ij(τ) = 0,   τ ≠ 0 and/or i ≠ j.
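The moment integrals above can be checked numerically for a concrete one-step density. A sketch using a defective exponential density p_ij(τ) = p_ij λ e^{-λτ} (the values of p_ij and λ are invented), for which τ_ij = 1/λ and τ_ij^(2) = 2/λ²:

```python
import math

# Defective exponential one-step density p_ij(tau) = p_ij * lam * exp(-lam*tau).
# Exact values: integral = p_ij, tau_ij = 1/lam, tau_ij2 = 2/lam**2.
p_ij, lam = 0.4, 2.0

taus = [t * 1e-3 for t in range(40001)]               # grid on [0, 40]
dens = [p_ij * lam * math.exp(-lam * t) for t in taus]

def trapz(ys, xs):
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) / 2
               for i in range(len(xs) - 1))

p_hat = trapz(dens, taus)                                            # ~ 0.4
tau1 = trapz([t * d for t, d in zip(taus, dens)], taus) / p_hat      # ~ 1/lam = 0.5
tau2 = trapz([t * t * d for t, d in zip(taus, dens)], taus) / p_hat  # ~ 2/lam**2 = 0.5
print(round(p_hat, 3), round(tau1, 3), round(tau2, 3))  # 0.4 0.5 0.5
```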



3. CONTINUOUS GENERATING FUNCTIONS

The n-step continuous generating function P_ij^n(s) of the n-step transition-time probability density p_ij^n(τ) is defined as the Laplace transform of p_ij^n(τ):

P_ij^n(s) = ∫_0^∞ p_ij^n(τ) e^{-sτ} dτ.

It follows that the one-step transition probability p_ij is

p_ij = P_ij(s)|_{s=0},

and the first and second moments of the transition time are

τ_ij = -(1/p_ij) (d/ds) P_ij(s)|_{s=0},    τ_ij^(2) = (1/p_ij) (d²/ds²) P_ij(s)|_{s=0}.

A convenient method of obtaining these parameters is by long-hand division, to place the generating function in the power-series form

P_ij(s) = p_ij(1 - τ_ij s + ½ τ_ij^(2) s² - ...).

Introduction of the n-step continuous generating function P_ij^n(s) into the recurrence relation for the n-step transition-time probability density p_ij^n(τ) produces a set of recurrence relations in terms of products and summations over the states of the system:

P_ij^n(s) = Σ_{r=1}^m P_ir^{n-1}(s) P_rj(s),

P_ij^0(s) = δ_ij.


4. TRANSITION DENSITIES

The transition density n_ij(τ) of the rate at which the system moves from state s_i to state s_j at the elapsed time τ, in any number of steps, is the sum of the n-step transition-time probability densities:

n_ij(τ) = Σ_{n=0}^∞ p_ij^n(τ).

This relation is obtained by the use of the characteristic function ξ_ij^n(τ)^r to indicate the elapsed time in the system moving from state s_i to state s_j on the nth move of the rth experiment. The transitions on any move of the rth experiment are then represented by the infinite sum

ξ_ij(τ)^r = Σ_{n=0}^∞ ξ_ij^n(τ)^r.

The transition density n_ij(τ) is then the average

n_ij(τ) = lim_{R→∞} (1/R) Σ_{r=1}^R ξ_ij(τ)^r = Σ_{n=0}^∞ p_ij^n(τ).

The average number of transitions m_ij between states s_i and s_j is the integral

m_ij = ∫_0^∞ n_ij(τ) dτ.

We define the generating function M_ij(s) of the transition density n_ij(τ) as the transform

M_ij(s) = ∫_0^∞ n_ij(τ) e^{-sτ} dτ = Σ_{n=0}^∞ P_ij^n(s),

which then produces the result that the average number of transitions m_ij is

m_ij = M_ij(s)|_{s=0} = Σ_{n=0}^∞ p_ij^n.


5. SIMULTANEOUS LINEAR EQUATIONS

The substitution of the recurrence relation for the generating functions of the n-step transition-time densities P_ij^n(s) into the relation for the generating function of the transition density

M_ij(s) = Σ_{n=0}^∞ P_ij^n(s)

produces the set of simultaneous linear equations

M_ij(s) = δ_ij + Σ_{r=1}^m M_ir(s) P_rj(s),   j = 1, ..., m.

In this set of equations the coefficients are the generating functions P_rj(s) of the one-step transition-time densities, and the variables are the generating functions M_ir(s) (r = 1, ..., m) of the transition densities. Figure 1 is a flowgraph representation of this equation. The first subscript of the variable M_ir(s) indicates the state s_i in which the system started. The first and second subscripts of the coefficients P_rj(s) indicate, respectively, the independent and dependent variables.

Figure 1. A flowgraph representation of a linear equation for the generating function of the transition density.

A second set of simultaneous linear equations is obtained from the first set by setting the variable s equal to zero:

m_ij = δ_ij + Σ_{r=1}^m m_ir p_rj,   j = 1, ..., m.

The average numbers of transitions m_ij (j = 1, ..., m) between states s_i and s_j are variables of a flowgraph corresponding to a Markov system characterized by its transition probabilities p_rj.
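In matrix form the second set of equations reads M = I + M P, so for a transient system M = (I - P)^(-1). A small sketch with two transient states (the probabilities are invented; the missing mass in each row exits to an absorbing set):

```python
# Average transition counts m_ij = delta_ij + sum_r m_ir p_rj, i.e. M = (I - P)^-1.
# Two transient states; the rest of each row's mass exits to an absorbing set.
P = [[0.2, 0.5],
     [0.0, 0.4]]

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

I_minus_P = [[1 - P[0][0], -P[0][1]],
             [-P[1][0], 1 - P[1][1]]]
M = inverse_2x2(I_minus_P)
# m_00 = 1/(1 - 0.2) = 1.25 (expected visits to the start state),
# m_01 = 0.5/(0.8 * 0.6) ~ 1.042.
print(M[0][0], M[0][1])
```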



6. TRANSIENT SYSTEMS

There are two methods for calculating the average duration of a transient system. The first is based on the fact that the average duration of any state s_j is the average number of transitions m_ij into the state times the average duration τ_j of the state. The duration d_i of the transient system starting in state s_i is then the sum over the transient states

d_i = Σ_{j=1}^m m_ij τ_j.

The second method introduces the generating function P_ic(s) of the exit probability density p_ic(τ) into the closed set of states S_c. It follows that the probability of entering the closed set S_c is

p_ic = P_ic(s)|_{s=0},

and the first and second moments of the duration are

d_ic = -(1/p_ic) (d/ds) P_ic(s)|_{s=0},    d_ic^(2) = (1/p_ic) (d²/ds²) P_ic(s)|_{s=0}.

The power-series expansion of the generating function P_ic(s) produces these parameters. The first and second moments of the duration are then

d_i = Σ_c p_ic d_ic,    d_i^(2) = Σ_c p_ic d_ic^(2),

where the summation is over the set of closed sets.

7. MISSION COMPLETION MODEL

A convenient model of the completion of a mission is that of Figure 2, in which the mission is continually repeated until it is a success (see [2]). The failure transition-time density p_f(τ) is thus associated with the loop, and the success transition-time density p_s(τ) is associated with the transition out of the transient system.

The flowgraph representing the single equation of the generating functions of the transition density is also given in Figure 2.

Figure 2. A mission completion model and its associated flowgraph.

It follows that

M(s) = 1 / [1 - P_f(s)],

from which the average number of transitions m is the reciprocal of the success probability,

m = 1/(1 - p_f) = 1/p_s,

and the average duration is

d = τ̄/p_s,

where the state duration τ̄ = p_f τ_f + p_s τ_s is the average of the average failure time τ_f and success time τ_s.

Use of the second method, in which the generating function

P(s) = P_s(s) / [1 - P_f(s)]

of the exit probability density is used, produces the same result for the average duration d. From the generating function of the exit density it follows that the second moment of the duration is

d^(2) = τ_s^(2) + 2(p_f/p_s) τ_f τ_s + (p_f/p_s) τ_f^(2) + 2(p_f/p_s)² τ_f²,

so that the variance σ_d² of the duration is

σ_d² = σ_s² + (p_f/p_s) σ_f² + (p_f/p_s²) τ_f².

The mission completion model is also a model of a service system that extends the life of the system as long as it does not fail in the interval of time between services. The change to the model of Figure 2 is one of substituting failure for success and service for failure.
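The closed-form results m = 1/p_s and d = τ̄/p_s can be checked by direct simulation of the loop of Figure 2. A sketch with deterministic failure and success times (all numbers invented):

```python
import random

random.seed(1)
p_s, tau_f, tau_s = 0.25, 2.0, 3.0   # success probability, mean times (invented)

def one_run():
    steps, elapsed = 0, 0.0
    while True:
        steps += 1
        if random.random() < p_s:
            return steps, elapsed + tau_s     # success: leave the loop
        elapsed += tau_f                      # failure: repeat the mission

N = 200_000
runs = [one_run() for _ in range(N)]
m_hat = sum(s for s, _ in runs) / N   # theory: 1/p_s = 4
d_hat = sum(e for _, e in runs) / N   # theory: (p_f*tau_f + p_s*tau_s)/p_s = 9
print(round(m_hat, 2), round(d_hat, 2))
```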


8. RELIABILITY 

A second application of the transient system analysis is to reliability theory
(see [3]). The reliability r(τ) of a system as a function of the elapsed time τ is
defined as the probability that the system is operating at the elapsed time τ.
Based on the assumption that the time of failure is a random variable with a
failure probability density p(τ), the reliability is the probability that the system
will fail to operate properly only after the elapsed time τ:

r(τ) = prob {failure after τ} = ∫_τ^∞ p(a) da.



FLOWGRAPHS APPLIED TO CONTINUOUS GENERATING FUNCTIONS 211

It then follows that the failure probability density p(τ) is the negative derivative
of the reliability:

p(τ) = −(d/dτ) r(τ).

The average operating time or duration τ̄ is then the average

τ̄ = ∫_0^∞ a p(a) da = ∫_0^∞ r(a) da.

In terms of the generating functions

R(s) = ∫_0^∞ r(τ) e^{−sτ} dτ   and   P(s) = ∫_0^∞ p(τ) e^{−sτ} dτ

of the reliability r(τ) and the failure time density p(τ), respectively, the
reliability generating function is obtained by the relation

R(s) = (1/s)[1 − P(s)].

The average operating time is then

τ̄ = R(s)|_{s=0} = −(d/ds) P(s)|_{s=0}.
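The relation R(s) = (1/s)[1 − P(s)] can be verified for a concrete density. A minimal sketch, assuming an exponential failure density p(τ) = λe^{−λτ} (so that r(τ) = e^{−λτ}) and illustrative values of λ and s, with both transforms computed by direct numerical integration:

```python
import math

lam, s = 0.5, 0.3          # illustrative failure rate and transform variable
dt, steps = 1e-3, 60_000   # midpoint rule on the interval [0, 60]

# P(s): Laplace transform of the failure density p(tau) = lam * exp(-lam*tau)
P = sum(lam * math.exp(-(lam + s) * (i + 0.5) * dt) for i in range(steps)) * dt
# R(s): Laplace transform of the reliability r(tau) = exp(-lam*tau)
R = sum(math.exp(-(lam + s) * (i + 0.5) * dt) for i in range(steps)) * dt
```

For the exponential case the closed forms are P(s) = λ/(λ + s) and R(s) = 1/(λ + s), and the numeric values satisfy R = (1 − P)/s as the relation requires.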

The standby failure model of Figure 3 is a system that uses identical pieces 
of equipment sequentially until they fail. The elapsed time in the failure-time 
density for any one piece of equipment is counted from the time that the system 
begins to use it. Failure of the system then requires a sequential failure of all 
of the pieces of equipment. 


[Figure: a chain of states s_0 (no failure) → s_1 (one failure) → s_2 (two failures) → s_3 (three failures), each transition carrying the density p_1(τ); the flowgraph output is P(s).]

Figure 3. Standby failure model as a transient Markov system
and its associated flowgraph.


The generating function P_n(s) of the failure-time density of an nth-order
standby system is then

P_n(s) = [P_1(s)]^n.





It then follows that the average operating time τ̄_n is

τ̄_n = n τ̄_1,

and the variance is

σ_n² = n σ_1².
By the use of the inverse transform it is possible to obtain both the reliability
and failure-time density as a function of the elapsed time τ.
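Because the nth-order standby failure time is the sum of n independent unit failure times, the mean and variance simply add. A Monte Carlo sketch (the exponential unit lifetimes and parameter values are illustrative assumptions, not from the paper):

```python
import random

# n identical standby units used one after another: the system failure
# time is the sum of n unit failure times.
rng = random.Random(7)
n, tau1 = 3, 2.0   # illustrative: 3 units, exponential mean life 2.0

samples = [sum(rng.expovariate(1.0 / tau1) for _ in range(n))
           for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
# theory: mean = n * tau1 = 6.0; var = n * tau1**2 = 12.0
# (sigma_1^2 = tau1^2 for an exponential unit lifetime)
```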


9. STATIONARY-STATE DISTRIBUTIONS 

In a recurrent system the identity of the initial state soon becomes lost as the
system settles into a stationary-state distribution. There are three known
techniques for obtaining the stationary-state distribution.

The first technique is the application of the final value of the transition
density to a particular state times the average time the system remains in that
state; that is, the stationary-state probability P(s_j) that the system will be in the
state s_j is

P(s_j) = m_ij(∞) τ_j,

where the final value of the transition density is obtained from its generating
function M_ij(s) by the relation

m_ij(∞) = lim_{s→0} s M_ij(s).


The second technique is based on transforming the recurrent system into
a transient system by splitting the state s_i. The stationary-state distribution
P(s_i) is then the ratio of the average state duration τ_i and the average duration
d_i of the transient system:

P(s_i) = τ_i / d_i.


Within this transient system each state s_j occurs on the average m_ij times,
each time with an average state duration τ_j. The stationary-state distribution is then

P(s_j) = m_ij τ_j / d_i.


The third technique uses the cofactors Δ_i(s) of the flowgraph representation
of the recurrent system. The stationary-state distribution is

P(s_j) = τ_j Δ_j(0) / Σ_i τ_i Δ_i(0),


where the summation is over the recurrent states. 
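The second technique can be sketched in code for a small example. The embedded-chain transition matrix and state durations below are invented; the stationary distribution of the embedded chain is found by power iteration, the average visits m_0j per recurrence of a reference state follow as π_j/π_0 (a standard renewal argument), and the stationary-state distribution is m_0j τ_j / d_0:

```python
# Illustrative embedded-chain transition probabilities and state durations
P = [[0.0, 0.7, 0.3],
     [0.5, 0.0, 0.5],
     [0.6, 0.4, 0.0]]
tau = [2.0, 1.0, 4.0]

# stationary distribution pi of the embedded chain, by power iteration
pi = [1 / 3, 1 / 3, 1 / 3]
for _ in range(500):
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

# split state 0: average visits to j per recurrence of state 0 are pi_j/pi_0,
# and d_0 is the average duration of one recurrence cycle
m0 = [pi[j] / pi[0] for j in range(3)]
d0 = sum(m0[j] * tau[j] for j in range(3))

dist = [m0[j] * tau[j] / d0 for j in range(3)]   # P(s_j) = m_0j tau_j / d_0
```

The result is proportional to π_j τ_j, which agrees with the cofactor form of the third technique.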





[Figure: two states, operate and repair, connected in a loop.]

Figure 4. An operate-repair system and its flowgraph.


10. OPERATE-REPAIR MODEL 


An example of a recurrent system is one that is either in an operating state
or in a state of being repaired. A model of this system is shown in Figure 4,
in which the transition time densities p_o(τ) and p_r(τ) determine the average
lengths of time τ_o and τ_r spent in the operating cycle and the repair cycle,
respectively.

Solution of the flowgraph of Figure 4 produces the results

M_oo(s) = 1 / [1 − P_o(s) P_r(s)],   M_or(s) = P_o(s) / [1 − P_o(s) P_r(s)],

from which the stationary-state distribution is obtained:

P(s_o) = τ_o / (τ_o + τ_r),   P(s_r) = τ_r / (τ_o + τ_r).




11. OPERATE-REPAIR MODEL WITH STANDBY 

A more complicated example of a recurrent system is one that remains in a 
standby state except when it is operating or being repaired, as in Figure 5. 



Figure 5. An operate-repair model with standby state.





The flowgraph in Figure 5 produces the generating functions M_ss(s), M_so(s),
and M_sr(s), from which the following stationary-state distribution is obtained:

P(s_s) = τ_s / [τ_s + p_so τ_o + (1 − p_so p_os) τ_r],

P(s_o) = p_so τ_o / [τ_s + p_so τ_o + (1 − p_so p_os) τ_r],

P(s_r) = (1 − p_so p_os) τ_r / [τ_s + p_so τ_o + (1 − p_so p_os) τ_r],

where τ_s, τ_o, and τ_r are the state durations of the standby, operation, and
repair states, respectively.
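A quick numerical check of these expressions (the transition probabilities and durations below are illustrative values, not from the paper): since the common denominator is the sum of the three numerators, the probabilities must sum to one.

```python
# Illustrative parameter values
p_so, p_os = 0.6, 0.8                # standby->operate, operate->standby
tau_s, tau_o, tau_r = 1.0, 3.0, 2.0  # state durations

D = tau_s + p_so * tau_o + (1 - p_so * p_os) * tau_r
P_s = tau_s / D
P_o = p_so * tau_o / D
P_r = (1 - p_so * p_os) * tau_r / D
```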


12. CONCLUSION 

The use of flowgraphs provides a technique for analyzing the properties of 
continuous Markov systems. Appropriate formulation with generating func- 
tions produces a set of simultaneous linear equations with the generating func- 
tions of the transition densities as the variables. The analysis leads to results 
relating to transition probabilities, transient durations, and stationary-state 
distributions. 

Applications of continuous Markov systems include mission completion, 
inspection, reliability, and operate-repair with standby. 


13. REFERENCES 

[1] C. S. Lorens, Flowgraphs for the Modeling and Analysis of Linear Systems, McGraw-
Hill, New York, 1964.

[2] C. S. Lorens, “Subsystems Reliability Costs,” Communications Systems Technical
Rept. No. 23 (August 26, 1965).

[3] C. S. Lorens, “The Use of Redundant and Standby Subsystems,” Communi-
cations Systems Technical Rept. No. 21 (August 10, 1965).


GRAPHES DES FLOTS APPLIQUES 
AUX FONCTIONS GENERATRICES CONTINUES 


RESUME

The continuous generating functions that characterize Markov chains can be
simplified by a graphical representation. The many possible probability densities
of passing from one state to another in successive steps over the time considered
can be replaced by analogous functions that pass from one state to the other in a
single step but through multiple loops. These loops, in turn, can be replaced by
products of generating functions. The use of such functionals in a Markov system
transforms it into a system of linear equations that can be represented by a graph
similar to the graph of the Markov chain. For transient systems, the graphical
method yields a generating function from which the first and second moments of
the duration are obtained. For a recurrent process, the graphical method gives a
generating function with which the statistical distribution of the stationary state
can be defined.

Markov chains are used notably to compute the probability of completing a
specified mission, to predict the availability of a system at a required time, or to
determine the inspection rates needed to improve its availability, for which the
average duration and the variance are of particular interest. The present method
is also very useful for determining the availability of a repairable system with a
given number of spare parts.


STOCHASTIC NETWORKS AND 
RESEARCH PLANNING 

Réseaux Stochastiques et
Planification de la Recherche

Burton V. Dean 

Case Institute of Technology 
Cleveland, Ohio, United States of America


1. STOCHASTIC RESEARCH PATHS 

Planning methods for research projects and tasks have been studied (see Refer- 
ences). Because research is undertaken in the hope that the knowledge gained 
may lead to innovations such as new products, new processes, and better theories, 
it is essential that the research planning become part of the research and develop- 
ment (R and D) effort. 

The individual research task is usually under the control of a principal
investigator, who is considered to be responsible for its success or failure.
The principal investigator is responsible for setting the task's goals and subgoals
as well as for planning, controlling, and evaluating its progress. Although he
is usually free to attack and solve the problem in any manner he sees fit, he is
always subject to constraints of time, manpower, facilities, money, and his own
ability and the ability of those under him. It is necessary to conduct the research
in a manner that offers the highest probability of success, subject to the resources


216 


BURTON V. DEAN


[Figure: stages S_1, S_2, S_3 and the goal, reached at successive times over months 1-12.]

Figure 1. Sequential time-phased stages of a research project.


and ability constraints. To accomplish this objective the principal investigator
has to be sure (a) that the approach or approaches taken are best suited to arriving
at a solution to the research problem and (b) that an approach itself is followed
in the "best" possible manner. Thus the planning and control of a research
task involves the planning and control of one or more alternative approaches.

It is possible to define at each stage the goal of a research project. If this
were not so, there would be no research but only random, aimless search. In
order to reach this goal, intermediate stages of progress or accomplishment
must be achieved.

Suppose it were also possible to define, in advance, the specific inter-
mediate stages or milestones. It would then be possible to order each
stage sequentially in terms of accomplishment; that is, it would be possible to
say which must be accomplished first, which second, and so on. The research
project progressively passes through stages of completion until the goal is
reached; for example, suppose that an already completed research task had an
ordered sequence of stages of completion and the times to reach them as given
in Figure 1.

Figure 1 indicates that the entire project required one year: two months
to reach the first stage of completion S_1, an additional three months to reach the
second stage S_2, and so on. If this process of the occurrence of successively
acquired stages is plotted on a vertical axis against the time required to acquire
them on the horizontal axis, the result would be a graph, as in Figure 2.

Figure 2 represents the historical path for achieving successive stages of 
completion that was actually followed in the project. The completion of any 




STOCHASTIC NETWORKS AND RESEARCH PLANNING 


217 


[Figure: stages of completion (milestones), from the first stage up to the goal, plotted against time; several possible paths.]

Figure 3. Family of project progress paths.


stage, however, is not fixed in time; for example, stage 1 could have occurred
after one or three months of work. The time of completion of each stage is
variable and a function of resource inputs such as money and manpower.
Although a research project follows one particular path in the time-stage
relationship, an entire family of paths was originally possible, any one of which
could have been followed. Figure 3 represents such a family or distribution of
time-stage-of-completion relationships.

The planning of a research project is the process by which one path is se-
lected. Although one or more paths in this family of paths is the best one to
follow, it is not possible to define the criterion of "best" here or to select in
advance the "best" path. It is evident, however, that some ways of conducting
a research project are better than others. If it is possible to compare one method
with another and say that one is better than the other, using a transitive relation,
it is also possible to find one method such that there is none better.


2. STOCHASTIC NETWORKS

Perhaps the most basic of all techniques used in development planning is the
critical path method (CPM), a deterministic network tool used for planning and
scheduling. It is deterministic in that it assumes that the sequence of events and
activities of the project is known with certainty, along with the length of time
needed to complete each activity. The sequence of events and the activities con-
necting the events is represented by a directed, acyclic network. Since the time
to complete each link in the network is known and fixed, the longest path through
the network can be calculated with certainty. The activities along this
critical path need special attention, for the completion of the project on time is
dependent on their being completed on time.
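The forward pass that CPM performs can be sketched in a few lines; the event network and activity times below are invented for illustration.

```python
from collections import defaultdict

activities = {            # (from_event, to_event): fixed activity duration
    (0, 1): 3, (0, 2): 2, (1, 2): 1, (1, 3): 4, (2, 3): 6,
}
succ = defaultdict(list)
for (u, v), t in activities.items():
    succ[u].append((v, t))

order = [0, 1, 2, 3]      # a topological order of the events
earliest = {n: 0 for n in order}
for u in order:           # forward pass: longest path to each event
    for v, t in succ[u]:
        earliest[v] = max(earliest[v], earliest[u] + t)

project_length = earliest[3]   # length of the critical (longest) path
```

Here the critical path is 0 → 1 → 2 → 3, giving a project length of 10.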

A second deterministic network technique is a critical path-minimum cost
method. This method also assumes that the sequence of events and activities is
known with certainty. Now, however, the time to complete any activity is not
fixed but can vary over a known discrete range. This model assumes that the
cost of performing an activity is a decreasing linear function of the length of
time needed to perform that activity. Thus it is possible to compute the total



218 


BURTON V. DEAN 


cost of completing the project for any set of completion times of the activities.
For a given length of time to complete the project it is possible to find the set of
activity times that minimizes the total cost of completing the project. Under the
assumption that there is a delay cost that is linear in the length
of time needed to complete the project, it is possible to determine the optimum
completion time with respect to minimizing the total cost of completion and delay,
and the schedule of activity times that gives that optimum.

A third planning technique is known as PERT/TIME. This method
assumes that the sequence of events and activities is known with certainty, but
the completion times become random variables. The time needed to complete
an activity is subject to a probability distribution. For computational purposes
the mean completion time of this distribution is used to find the critical path.
It is then assumed that the sum of completion times along the critical path
(hence the project completion time) follows a normal distribution with mean and
variance equal to the sums of the means and variances, respectively, along the
critical path.
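The PERT/TIME normal approximation can likewise be sketched; the per-activity means and variances below are illustrative.

```python
import math

# (mean, variance) of completion time for each activity on the critical path
critical_path = [(5.0, 1.0), (8.0, 4.0), (3.0, 0.25)]
mean = sum(m for m, _ in critical_path)   # project mean, 16.0
var = sum(v for _, v in critical_path)    # project variance, 5.25

def prob_done_by(t):
    """P(project completes by time t) under the normal approximation."""
    return 0.5 * (1.0 + math.erf((t - mean) / math.sqrt(2.0 * var)))
```

By symmetry the probability of finishing by the mean time is exactly one half.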

These three techniques are considered to be realistic models of a develop-
ment project. They require an assumption, however, that is not realistic when
these techniques are applied to research projects: that the sequence of events
and activities is known with certainty. In a research project the events
themselves are not known in advance with certainty, much less their sequence.
Consequently, these techniques are not directly applicable to planning and
controlling research projects, for which a model that considers only the statistical
determination of completion times for activities would be insufficient.

2.1. Nature of a Research Project 

An investigation into the nature of a research project reveals that it is similar
to a development project in one respect. A research project as a whole is com-
posed of a set of small, specific research tasks. Each is undertaken to provide
an answer to a specific question or to provide information that is needed as
part of the answer to the over-all question being investigated. The successful
completion of a research task brings the research project closer to its goal.

Some research tasks may take place simultaneously or in parallel. During a
given period of time the researcher† may be investigating more than one aspect
of the problem, in which answers to two or more questions may be sought at the
same time. In this case the research tasks may be thought of as occurring in
parallel.

Other research tasks may be dependent on one another in that one may be
undertaken only after a previous one has been completed. A task may also need
the results of more than one previous research task, and likewise it may be the
antecedent of more than one. In this case the tasks may be thought of as occurring
sequentially or in series.

† The word "researcher" is used to describe an individual or a group that may be
working on the project.





2.2. Stochastic Network Concept 

The research project can be described by an ordered series-parallel system,
a concept directly analogous to that used to describe development projects. It
is this concept that leads to the introduction of a network. In the application of
network theory to research projects, as contrasted with development projects, a
degree of uncertainty is present in the network.

As an illustration of the development of a network, suppose the researcher
has defined a goal for his research that represents the knowledge he wants to
acquire as a result of the project, and that he can define a number of prior states
or areas of knowledge from which he would be able to acquire his goal. For the
sake of illustration assume that there are three prior knowledge states. Suppose
now that the researcher can define knowledge states or areas on which each of
these three knowledge states is dependent. Continuing in this manner, at some
point the knowledge states last included in the network would be dependent on
knowledge that the researcher already has. At this point a network may repre-
sent the sequence and dependency of the knowledge states that the researcher
foresees to be necessary in order to reach the goal of the project. This new degree
of uncertainty in the network arises from the uncertainty associated with the
nodes themselves. Both the nodes and the links of a research network are subject
to uncertainty. The nodes of the network represent specific "areas" or "levels"
of knowledge; for example, the researcher might need knowledge about "this,"
and "knowledge about 'this'" becomes a node in the network. The event of
reaching a node occurs when the researcher feels subjectively that he has gained
sufficient knowledge of "this" to proceed to other investigations. The links
of the research-project network represent a knowledge-research dependency.
The researcher might say that in order to get knowledge (node) j he must first
have knowledge (node) i and then perform a research task. A research task
which, based on previously acquired knowledge (node) i, is performed to acquire
knowledge (node) j, is a link (i, j) in the network. Thus a link in the research-
project network represents both a knowledge dependency and an actual research
task. A dummy link is defined as one that represents only a knowledge-state
dependency. In other words, some knowledge must be acquired before other
knowledge states can be acquired, but there is no direct research link between
the two. To complete the definition of the network, the origin node is defined
as all the pertinent knowledge with which the researcher begins the project,
and the terminal node is defined as the knowledge goal of the research project.

The uncertainty in the research-project network arises not only from the
uncertainty about the time to complete link (i, j) but also from the uncertainty
whether node j can be reached at all and, if it is reached, whether the acquired
knowledge will be useful to the researcher in achieving the project goal. This
uncertainty produces a change over time in the configuration of the network
itself. Unlike the nodes of the deterministic development-project network,
which remain fixed over time, the nodes of the stochastic research-project net-
work are themselves subject to change. Over time some nodes may drop from
the network and new nodes may be added. The result is a change in the
links of the network as well.





The causes of the changing configuration may be internal or external to the
project. The internal causes stem from the results of research tasks currently in
progress or already completed; for example, the researcher may decide that a
certain research task will not be able to provide the planned knowledge. Con-
sequently, that node will drop from the network, and other areas of knowledge
(nodes), which will act as substitutes for the knowledge that could not be
acquired, will have to be defined. As a result of the new configuration of nodes,
new research tasks will have to be defined in order to acquire the new areas of
knowledge (nodes). The external causes of change are of the type by which know-
ledge acquired outside the activity of the project causes the configuration of
nodes to change; for example, an outside discovery of a fact may mean that certain
areas of knowledge no longer need to be investigated, and a number of nodes and
links would be removed from the network.

The research-project network is a directed acyclic network. There is only 
one node that has no links leading into it (origin) and only one node that has no 
links leading out of it (goal). These properties must hold true before and after 
every configurational change that may occur. To complete the project every 
path through the network must be followed. 

An example of a directed acyclic research network that represents the plan
of a research project is given in Figure 4. Node 0 represents the present level
of pertinent knowledge of the researcher. Node 7 represents the goal of the
research, the desired knowledge. The intermediate nodes represent knowledge
states that must first be acquired in order to reach the goal. The links, solid
arrows, represent the knowledge-research dependencies between the defined
knowledge states; for example, to reach knowledge state 1 the researcher must
perform a research task (0, 1), which is based on his present level of knowledge.
The dashed links represent the dummy tasks, which are strict knowledge de-
pendencies. The network is directed (i.e., a knowledge-research dependency
holds in only one specified direction), and it is acyclic (i.e., there are no closed
loops).

In a research project it is impossible to plan with certainty the exact 
sequence of events that will occur; that is, the research network that will, in 
fact, have been followed is not known in advance. If the exact network could be 
known in advance, the problem of planning and controlling research would 
be reduced to the problem of planning and controlling development projects. 
It is this uncertainty about the knowledge that will be needed and the sequence 



Figure 4. A directed acyclic network. 





in which this knowledge is to be acquired that is the essence of research and the
characteristic that distinguishes it from development and other R and D
activities. The inability to plan with certainty, however, should not prevent the
researcher from engaging in planning research to reduce the costs of uncertainty.
At the beginning of and throughout the life of the project the researcher has access
to information that could be used in planning and controlling the research
project. The quantitative models developed in subsequent sections are of a
nature that makes use of the increasing information available to the researcher
as the project progresses.

2.3. Research Tasks 

Consider a simple research project made up of two research tasks in series.
The network for such a project is given in Figure 5. The researcher at time t_0
is at the knowledge state represented by node 1. He plans first to acquire the
knowledge represented by node 2 and, once that knowledge is acquired, to
acquire the knowledge represented by node 3. The over-all objective of the
researcher is to reach the knowledge state represented by node 3. Thus he is
concerned with the following uncertainties.

[Figure 5. Simple serial network.]

1. Can he reach the knowledge state represented by node 2? This question
is concerned with the ability of the researcher to acquire the desired knowledge
state.

2. If he cannot acquire the knowledge represented by node 2, can he
define one or more alternative knowledge states which, when taken together, are
a substitute for the originally desired knowledge state? This question is con-
cerned with the substitutability of the knowledge states.

3. Given that he has reached state 2, can he now acquire the knowledge
represented by state 3?

The stochastic element of the research network can be expressed by examin-
ing the possible outcomes that can arise from the researcher's uncovering answers
to the previous questions. This analysis consists in looking at the general
research task (i, j) in the network. Assume that the researcher is conducting
research task (i, j) in an effort to go from knowledge state i to knowledge state j.

Case 1. Knowledge state j is reached. This occurs when, after spending an
amount of time on research task (i, j), the researcher decides that he has acquired
the desired knowledge state j. The researcher is now in a position to start any
research tasks that have node j as their starting node. In this case the network
has not changed, and the researcher has so far accomplished what he had planned
to do.

Case 2. State j cannot be reached directly, but an alternate state (or states) can be
defined. After having worked a period of time on research task (i, j), the re-
searcher feels that he cannot acquire the knowledge state j, but that he can
define one or more alternate states of knowledge which together substitute for





the originally planned knowledge state j. In this case the network will change.
For the sake of illustration, assume that the researcher can define two alternate
knowledge states, j1 and j2. The original research link

(i) → (j)

will change into a configuration in which an intermediate node i1 leads, by two
parallel links, to j1 and j2, which are then combined into j.

The node i1 represents the knowledge state that the researcher had acquired
in the time spent on the original research task. The research task (i, i1)
is now history; that is, the "now" for planning purposes is node i1. The
researcher plans to conduct two parallel research tasks (i1, j1) and (i1, j2). If
and when both knowledge states j1 and j2 are acquired, they can be combined
and will be equivalent to the originally planned knowledge state j.
The dashed links represent knowledge dependencies. (They are dummy links
analogous to those of PERT and CPM.) A situation may develop in which,
after such a configurational change, node i1 will coincide with node i. This
happens when the researcher has acquired no useful information while con-
ducting the original research task (i, j).

Case 3. State j cannot be reached, and the researcher can define no alternate
states, nor can he define a whole new plan of attack. In this case the research
project ends in failure. It becomes impossible to complete all paths of the
network.

These three cases exemplify the basic ways in which the configuration of
the network may change over time. This configurational change may take place
because of internal or external changes. It is important to point out that this
article is concerned with the way a research network may change and not with the
causes of the change. The uncertainty associated with the causes of change is
too great to be subject to a quantitative analysis. In other words, why a researcher
may decide that a knowledge state cannot be reached is not under study. Rather, it is
important to consider what happens when such a decision is made. This is done
by the use of the subjective knowledge of the researcher.

2.4. Definitions of Relevant Variables 

For a given research task (i, j) the following definitions are made:

p_ij = the probability of going from knowledge state i to knowledge state j,
given that the researcher is now at knowledge state i.

q_j = the probability that the alternative state or states of knowledge for
state j can be defined if state j cannot be reached.

n = the number of failures that can occur during each research task (i, j);
a "failure" is defined as the point at which a researcher decides that
knowledge state j cannot be achieved as it is presently stated.

The probability p_ij is a measure that is subjective to the researcher. There are
two possible methods, either of which would be useful in arriving at a value of
p_ij. The first is a Bayesian approach. The researcher would be placed in the
hypothetical position in which he would have to make a choice between under-
taking the research task (i, j) in the hope of acquiring knowledge state j or select-
ing a lottery: with probability p he wins event A, in which case he would be
given the knowledge represented by state j, and with probability 1 − p he wins
event B, in which case he would not get (and could not get) the knowledge
represented by state j. In other words, the researcher is faced with the choice of
undertaking the research task (i, j) in an effort to acquire the knowledge repre-
sented by state j or selecting a lottery in which he has a probability p of being
given that same knowledge. There is at least one value of p
for which the researcher would be indifferent between the two choices, and
that value of p would be a measure of the probability p_ij. The principal
drawback in this method is that the value of p may not be unique, for the re-
searcher may be indifferent between the two choices over a range of values of p.
In this case, however, it may be possible to use some average value taken over
the range of indifference.

The second possible method of measuring the probability p_ij is based on the
researcher's experience. It would be necessary for the researcher to develop a
set of categories of research tasks and to classify all his past research tasks into
it. He would then be asked into which category research task (i, j) would best
fit. The proportion of successes of that selected category would then be a measure
of the probability p_ij. This is a good way of arriving at the value of p_ij because
the researcher would be using his own experience to categorize the tasks. It
would incorporate his own judgment of what would be involved in undertaking the
task (i, j). These same two measures could also be used to arrive at values for
the probability measure q_j.

The probability q_j is a measure of the substitutability of state j. The greater
the value of q_j, the more confident the researcher is that alternative states could
be found if necessary. Likewise, the smaller the value of q_j, the more confident
he is that it is important to acquire the knowledge of state j.

2.5. Probability Calculations 

Under the assumption that only one alternative is defined and one failure is
allowed, and

p*_ij = the probability of going from knowledge state i to knowledge state j
or its alternative, given that knowledge state i has been acquired,

we have that

p*_ij = p_ij + (1 − p_ij) q_j p_ij^(1). (1)





This probability p*_ij is bounded: 0 ≤ p*_ij ≤ 1. It is a measure of the subjective
uncertainty of the researcher of going from knowledge state i to knowledge state
j or its alternative under the condition that only one failure is allowed.

If two opportunities of failure are allowed, and under the condition that
only one alternative is defined at each failure, then

p*_ij = p_ij + (1 − p_ij) q_j [p_ij^(1) + (1 − p_ij^(1)) q_j^(1) p_ij^(2)], (2)

where q_j^(1) is the probability of being able to define the alternative that would
be required at the occurrence of the second failure. Equation 2 may be general-
ized to consider the case of n opportunities of failure, in which at each failure
only one alternative is defined. In this case

p*_ij = p_ij + (1 − p_ij) q_j {p_ij^(1) + (1 − p_ij^(1)) q_j^(1) [p_ij^(2) + ⋯
+ (1 − p_ij^(n−1)) q_j^(n−1) p_ij^(n)]}. (3)
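Equations (1)-(3) have a natural recursive form: each additional opportunity of failure wraps the previous expression in one more (1 − p)q factor. A sketch with invented subjective estimates (notation: p[k] and q[k] stand for p_ij^(k) and q_j^(k)):

```python
def p_star(p, q):
    """Probability of reaching state j or an equivalent, with len(q)
    opportunities of failure and one alternative defined at each failure.
    Recursion: p* = p[0] + (1 - p[0]) * q[0] * p_star(p[1:], q[1:])."""
    if len(q) == 0:          # no more failures allowed
        return p[0]
    return p[0] + (1.0 - p[0]) * q[0] * p_star(p[1:], q[1:])

# one failure allowed, equation (1): 0.5 + 0.5*0.9*0.6 = 0.77
one = p_star([0.5, 0.6], [0.9])
# two failures allowed, equation (2): 0.5 + 0.45*(0.6 + 0.4*0.8*0.7) = 0.8708
two = p_star([0.5, 0.6, 0.7], [0.9, 0.8])
```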

Thus by the use of formulas such as (1)-(3) it is possible to calculate p*_ij,
the probability of acquiring knowledge state j or its equivalent. Because each such
probability is conditional on being in the link's starting state, for a chain of two
serial research tasks (i, j) and (j, k) the probability of completing the chain p_c is

p_c = p*_ij p*_jk. (4)

Generally, the probability of completing a chain of n links is

p_ci = ∏ p*_ij,  (i, j) ∈ S_ci,   (5)

where S_ci is the set of links in the c_i th chain. The probability p_ci can now be
computed for all chains in the network. The probability of completing the
network p is

p = ∏ p_ci,   (6)

where the product is taken over all chains in the network. 
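These calculations are simple enough to script. The following sketch (in Python; the function names and the link probabilities are mine, chosen for illustration, not part of the paper) evaluates formula (1) for a single link and then combines links into chains and chains into a network as in (4)-(6):

```python
def p_star(p, q, p_alt):
    """Formula (1): succeed directly with probability p, or fail,
    define an alternative (probability q), and succeed on it (p_alt)."""
    return p + (1 - p) * q * p_alt

def chain_probability(links):
    """Formulas (4)-(5): a serial chain succeeds if every link does."""
    prob = 1.0
    for p in links:
        prob *= p
    return prob

def network_probability(chains):
    """Formula (6): the network succeeds if every chain does."""
    prob = 1.0
    for links in chains:
        prob *= chain_probability(links)
    return prob

# Hypothetical two-link chain (i, j), (j, k):
p_ij = p_star(0.6, 0.8, 0.5)   # 0.6 + 0.4*0.8*0.5 = 0.76
p_jk = p_star(0.7, 0.9, 0.4)   # 0.7 + 0.3*0.9*0.4 = 0.808
print(round(chain_probability([p_ij, p_jk]), 4))   # 0.6141
```

The same `p_star` value feeds both the chain product (5) and the network product (6), matching the way the link estimates propagate upward in the model.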


3. RESEARCH PHASES 

Consider a set of systems S_1, S_2, . . . , S_n that is being investigated in a research
organization. For each system we have that

C_ij = jth component in the ith system; i = 1, 2, . . . , n,

j = 1, 2, . . . , n_i.

For each system to be developed we require that all corresponding components
are to be developed.

For each component C_ij we suppose that there is a set of technical alterna-
tives,

T_ijk = kth technical alternative leading to the development of component C_ij;

k = 1, 2, . . . .



STOCHASTIC NETWORKS AND RESEARCH PLANNING 


225 


We consider that any of the technical alternatives, if successful, would be 
sufficient to accomplish the objective of developing the corresponding com- 
ponent. Each technical alternative is composed of a number of research tasks, 
established as intermediate and final goals of the project. 

The aggregation of the three sets {S_i}, {C_ij}, and {T_ijk} forms the three
research phases.

We consider four stochastic decision problems in the following sections: 

1. How should research tasks be sequenced to minimize the expected cost
of a research study (Model I)?

2. How should technical alternatives be selected to maximize the prob- 
ability of achieving a component concept (Model II)? 

3. How should component concepts be funded to maximize the prob- 
ability of achieving a system (Model III)? 

4. How should funds be allocated across systems to maximize total value
(Model IV)?

Models II-IV are solved sequentially, and the results in the form of a relation
between cost and probability of success for one model are used as an input to a
succeeding model.

The application of operations research is to develop decision models for and
solutions to the problems described, which are parameterized by the dollar
amounts to be made available. In this way it is possible to construct a cost-
effectiveness approach to research planning under uncertainty that presents
a relation between the amount allocated to R and D and the maximum expected
value to be achieved.

The basic method used is dynamic programming, which is a mathematical 
method for determining the optimal allocation of resources to activities to best 
achieve an organizational objective; for example, if an organization is engaged 
in a number of tasks and can estimate the expected costs and benefits of each 
task, dynamic programming may be used to find the optimal amount to allocate 
to each task to maximize the performance of the organization. In nontechnical 
terms, this is accomplished by solving a number of one-dimensional decision 
problems instead of the multidimensional allocation problem originally formu- 
lated. Because planning involves the allocation of resources to activities, it may 
be observed that dynamic programming is a useful planning tool. In particular,
the measures of performance of alternate plans that correspond to different
budget levels are obtained as a direct result of the dynamic programming
solution. Because dynamic programming is an iterative method, it is capable of
being easily converted into a computer program. 


4. LOW-COST-HIGH-RISK SEQUENCING

Research tasks are characterized by a high degree of uncertainty. An important
decision that must be made in the conduct of a research study is selecting the
sequence in which the tasks are to be performed. We consider, in this case, that
a number n of independent tasks are to be performed, in which the study will be





successful if and only if all the tasks are successful. The problem is to deter-
mine the optimal sequence of performing the tasks that minimizes the total
expected cost of performing the study.

Example. Consider the following example of two tasks T_1 and T_2, with
associated probabilities of success p_1 and p_2 and costs C_1 and C_2.

Task    Probability of Success    Cost

T_1     0.50                      $10,000
T_2     0.25                      $20,000


The problem is to select the sequence in which the tasks are to be performed. In
the event that the sequence T_1 T_2 is selected the expected cost will be

$10,000 + 0.50($20,000) = $20,000.

In the event that the sequence T_2 T_1 is selected the expected cost will be

$20,000 + 0.25($10,000) = $22,500.


Hence the optimal task sequence is T_1 T_2. In general the expected cost of each
of the two alternatives may be calculated as follows:

E(T_1 T_2) = C_1 + P_1 C_2,

E(T_2 T_1) = C_2 + P_2 C_1.

Now E(T_1 T_2) < E(T_2 T_1) if and only if C_1 + P_1 C_2 < C_2 + P_2 C_1, or

C_1 − P_2 C_1 < C_2 − P_1 C_2,

C_1 (1 − P_2) < C_2 (1 − P_1),

C_1 / R_1 < C_2 / R_2,

where R_1 = 1 − P_1 and R_2 = 1 − P_2. Thus the optimal sequence of two tasks is
to rank order on the basis of cost-risk ratios and to select as the initial task that
having the lowest cost-risk ratio. In this example we have

C_1 / R_1 = $10,000 / 0.50 = $20,000

and

C_2 / R_2 = $20,000 / 0.75 = $26,667.


In the general case of n independent research tasks, T_1, T_2, . . . , T_n, the rank
ordering of

C_1 / R_1 ≤ C_2 / R_2 ≤ · · · ≤ C_n / R_n



would produce the optimal sequence. Any sequence that is not so ordered could be
reduced in total expected cost by interchanging adjacent tasks not satisfying the
inequality on cost-risk ratios.

Note, in the case in which the n independent research tasks have approximately
the same costs, it may be observed that the optimal sequence is one in which

R_1 ≥ R_2 ≥ · · · ≥ R_n;

that is, we perform the high-risk tasks first. If, however, the tasks have approxi-
mately the same risk, the optimal sequence is that obtained by rank ordering the
costs

C_1 ≤ C_2 ≤ · · · ≤ C_n

and performing the low-cost tasks first. In the general case, in which costs and risks
may be different, an optimal sequence of independent research tasks is that obtained
by rank ordering the tasks on the basis of the low-cost-high-risk ratios.
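The sequencing rule and the two-task example can be reproduced mechanically. The sketch below (Python; the function names are mine, not the paper's) computes the expected cost of a sequence, where a task is paid for only if every earlier task succeeded, and orders tasks by the cost-risk ratio C/(1 − P):

```python
def expected_cost(sequence):
    """Expected cost of performing tasks in the given order; the study
    stops, and later tasks are never paid for, at the first failure.
    Each task is a (cost, probability-of-success) pair."""
    total, p_reach = 0.0, 1.0
    for cost, p in sequence:
        total += p_reach * cost   # paid only if all earlier tasks succeeded
        p_reach *= p
    return total

def optimal_sequence(tasks):
    """Rank by cost-risk ratio C / (1 - P), lowest ratio first."""
    return sorted(tasks, key=lambda t: t[0] / (1 - t[1]))

tasks = [(10_000, 0.50), (20_000, 0.25)]          # T1, T2 of the example
print(expected_cost(optimal_sequence(tasks)))     # 20000.0 (T1 first)
print(expected_cost(list(reversed(tasks))))       # 22500.0 (T2 first)
```

A task with P = 1 carries no risk and the ratio is undefined for it; such a task can simply be scheduled last.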


5. OPTIMAL SELECTION OF TECHNICAL 
ALTERNATIVES (MODEL II) 


Suppose that i = 1, 2, . . . , n represents n independent technical alternatives
that are proposed to accomplish a specified R and D task leading to the develop-
ment of a component. The problem is to find the set of technical alternatives
to be investigated that would maximize the probability of achieving a com-
ponent goal for a specified total cost. Let

P_i = the probability that the ith technical alternative is successful if it is
selected for investigation,

C_i = the expected cost of performing the ith alternative, if it is selected.

Suppose that a total amount $C is available for accomplishing the com-
ponent development task. Define the following functions:

f_i(C) = maximum probability of accomplishing the task for an optimal
set of alternative approaches, i = 1, 2, . . . , n.


Then

f_1(C) = P_1 if C_1 ≤ C,
       = 0 if C_1 > C,

f_2(C) = max[1 − {1 − P_2}{1 − f_1(C − C_2)}, f_1(C)] if C_2 ≤ C,
       = f_1(C) if C_2 > C,

  . . .

f_n(C) = max[1 − {1 − P_n}{1 − f_{n−1}(C − C_n)}, f_{n−1}(C)] if C_n ≤ C,
       = f_{n−1}(C) if C_n > C.

It may be observed that the dynamic programming procedure provides the means
of determining the solution for any specified budget values, for the optimization
is carried out for incremental values of C = 0, h, 2h, . . . , up to any desired
level, say the total cost of performing all technical alternatives.
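A minimal implementation of this recursion over a discrete cost grid might look like the following Python sketch; the function name is mine, and the data are those of the four-alternative illustration that follows (costs in thousands of dollars). Recording the selected set alongside f_i(C) also reproduces the traceback through the table columns:

```python
def select_alternatives(probs, costs, budget, h=10):
    """Tabulate f_i(C) for C = 0, h, 2h, ..., budget, where f_i(C) is
    the maximum probability of accomplishing the task using alternatives
    1..i within total cost C.  Also records the optimal set for each C."""
    grid = list(range(0, budget + h, h))
    f = {c: 0.0 for c in grid}            # f_0(C) = 0: no alternatives yet
    chosen = {c: [] for c in grid}
    for i, (p, cost) in enumerate(zip(probs, costs), start=1):
        g, pick = {}, {}
        for c in grid:
            skip = f[c]                   # f_{i-1}(C): do not fund alternative i
            take = 1 - (1 - p) * (1 - f[c - cost]) if cost <= c else 0.0
            if take > skip:
                g[c], pick[c] = take, chosen[c - cost] + [i]
            else:
                g[c], pick[c] = skip, chosen[c]
        f, chosen = g, pick
    return f, chosen

# Data of the four-alternative illustration (costs in $1000):
P = [0.7, 0.3, 0.5, 0.4]
C = [50, 20, 60, 30]
f, chosen = select_alternatives(P, C, 160)
print(round(f[130], 3), chosen[130])   # 0.895 [1, 2, 3]
print(round(f[160], 3), chosen[160])   # 0.937 [1, 2, 3, 4]
```

Each pass over the grid is one column of the hand calculation; `take` is the bracketed expression 1 − {1 − P_i}{1 − f_{i−1}(C − C_i)} and `skip` carries the preceding column forward.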





We now consider, as an illustration, the problem of selecting from four
alternatives, as presented in Table 1:

Table 1

Example of Four Technical Alternatives —

Probabilities and Costs

Technical      Probability
Alternative    of Success     Cost

1              0.7            $50,000
2              0.3            $20,000
3              0.5            $60,000
4              0.4            $30,000

f_4(160) = 1 − {0.3}{0.7}{0.5}{0.6} = 0.937. The total cost if all alternatives
are followed is $160,000, and if this amount is allocated to the four alternatives
the probability of achieving the component is 0.937.

Table 2 

Dynamic Programming Calculations of Technical Alternative Selection 

















The dynamic programming calculations may be presented in the form of
Table 2. The left column considers the total cost to be expended as variable,
and in this example intervals of $10,000 are used from 0 to $160,000. The
next two columns present an analysis of alternative 1, in which the function
f_1(C) is used. When C < $50,000 alternative 1 cannot be followed; a 0 is entered
in the column labeled 1; also, in this case, f_1(C) = 0. In the event, however, that
C ≥ $50,000, alternative 1 can be followed; a 1 is entered in the column labeled
1; also f_1(C) = 0.7, since the value of P_1 is 0.7.

Now consider alternative 2. In the event that C < $20,000, alternative 2
cannot be followed; a 0 is entered in the columns labeled 2 and f_2(C). In the
event that C = $20,000, $30,000, or $40,000, alternative 2 can be followed, and
we have, for example,

f_2(40) = max[1 − {1 − 0.3}{1 − f_1(20)}, f_1(40)]
        = max[1 − (0.7)(1), 0]
        = 0.3.

Suppose that C = $50,000.

f_2(50) = max[1 − {1 − 0.3}{1 − f_1(30)}, f_1(50)]
        = max[1 − (0.7)(1), 0.7]
        = 0.7.

We enter a 0 in the column labeled 2, for it is best not to select alternative 2
but alternative 1 instead. In this case f_2(50) = 0.7 = f_1(50).

Suppose that C = $70,000.

f_2(70) = max[1 − {1 − 0.3}{1 − f_1(50)}, f_1(70)]
        = max[1 − (0.7)(0.3), 0.7]
        = 0.79.

We enter a 1 in the column labeled 2.

The corresponding columns for alternatives 3 and 4 are completed in the
same way. Note that each column utilizes the calculations made for the pre-
ceding column; for example,

f_4(130) = max[1 − {1 − 0.4}{1 − f_3(100)}, f_3(130)]
         = max[1 − (0.6)(0.21), 0.895]
         = max(0.874, 0.895)
         = 0.895,

and alternative 4 is not selected.

The final column f_4(C) presents the maximum probability of achieving the
component concept as a function of the total cost C. This cost-effectiveness
relationship may be presented graphically, as in Figure 6. Note that the effec-
tiveness relationship has the general characteristic of being zero below a minimal





cost value and increasing rapidly for costs greater than this minimal value to a 
saturation value beyond which increasing the total cost has only a negligible 
influence on effectiveness. 

Each point on the cost-effectiveness curve corresponds to an optimal set of
technical alternatives. The method for obtaining the set for each cost value is
as follows: select a cost-effectiveness pair C, f_4(C), as, for example, C = 130,
f_4 = 0.895. At C = 130 alternative 4 is not selected. Proceed to column 3,
corresponding to C = 130 (since alternative 4 has not been selected). We have
that a 1 appears in the column labeled 3; alternative 3 is to be selected at a cost
of C_3 = 60. Proceed to the column labeled 2 at a cost of C − C_3 = 130 − 60 = 70.
We have that a 1 appears in the column labeled 2; alternative 2 is selected at a
cost of C_2 = 20. Proceed to the column labeled 1 at a cost of C − C_3 − C_2 = 70 −
20 = 50. We have that a 1 appears in the column labeled 1 and alternative 1
is selected. In this way the optimal sets of technical alternatives are obtained for
all values of the total cost C to be allocated to the set of alternatives. Thus the
decision maker has the following two results:

1. The maximum probability of achieving the component concept as a 
function of the total amount to be expended on the technical alternatives. 

2. The optimal set of technical alternatives as a function of the total amount 
to be expended. 

These results are used in the next model to determine the optimal funding levels 
of the component concepts. 

It may be noted that a simple ranking method is not possible for selecting
an optimal set of technical alternatives. At C = $160,000 all alternatives are






possible. At C = $140,000 and C = $150,000 alternative 2 is rejected. At
C = $100,000, $110,000, and $120,000, however, alternative 3 is rejected and
alternative 2 is accepted. Table 3 presents the optimal set of alternatives for
each total cost level.


Table 3

Relationship Between Total Cost C and Optimal Set of
Technical Alternatives S_c

C      S_c              C      S_c
160    1, 2, 3, 4       80     1, 4
150    1, 3, 4          70     1, 2
140    1, 3, 4          60     1
130    1, 2, 3          50     1
120    1, 2, 4          40     4
110    1, 2, 4          30     4
100    1, 2, 4          20     2
90     1, 4             10     0


Finally, it may be observed that this procedure may be followed on a manual 
basis if n < 20 and programmed for an electronic computer for values of n > 20. 
Furthermore, the method may be extended to consider costing a number of 
categories such as different program elements. 

6. OPTIMAL FUNDING OF COMPONENT 
CONCEPTS (MODEL III) 

Suppose that a set of m component concepts, j = 1, 2, . . . , m, is associated with
a system. Suppose that for each component concept we have that

P_j(C_j) = the probability of achieving the jth concept if $C_j is allocated
to it, as obtained from Model II.

We wish to find the optimal allocation of total funds $C across the component
concepts to maximize the probability of accomplishing the system. We note that
in order to achieve the system all component concepts must be fulfilled.

We construct the mathematical model as follows. The probability of ac-
complishing the system P(C), if $C is to be allocated to the system, is given by

P(C) = P_1(C_1) P_2(C_2) · · · P_m(C_m),   (1)

since the system is achieved if and only if all component concepts arc achieved. 

We wish to maximize P(C) subject to

C = Σ_{j=1}^{m} C_j.   (2)

We construct a dynamic programming formulation to the problem. Let

F_1(C) = P_1(C),

F_2(C) = max_{0 ≤ C_2 ≤ C} [P_2(C_2) F_1(C − C_2)],

  . . .                                                (3)

F_m(C) = max_{0 ≤ C_m ≤ C} [P_m(C_m) F_{m−1}(C − C_m)].







In m steps the allocation problem for Model III may be solved by using
dynamic programming solutions to single-variable optimization problems. In-
tervals of, say, $10,000 ought to be selected in order to make the search in (3).
The format in Model II may be used.
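A sketch of recursion (3) over such a discrete grid is below (Python; the function name and the two component curves are hypothetical, standing in for Model II output, with costs in thousands of dollars):

```python
def fund_components(curves, budget, h=10):
    """F_1(C) = P_1(C); F_j(C) = max over 0 <= C_j <= C of
    P_j(C_j) * F_{j-1}(C - C_j), tabulated on a grid of spacing h.
    Each curve maps a grid cost to a probability from Model II."""
    grid = list(range(0, budget + h, h))
    F = {c: 1.0 for c in grid}      # empty product before any component
    for P in curves:
        # One dynamic programming pass per component concept:
        F = {c: max(P[x] * F[c - x] for x in grid if x <= c) for c in grid}
    return F

# Hypothetical Model II curves for two components (cost in $1000):
P1 = {0: 0.0, 10: 0.4, 20: 0.6, 30: 0.7}
P2 = {0: 0.0, 10: 0.5, 20: 0.8, 30: 0.9}
F = fund_components([P1, P2], 30)
print(round(F[30], 2))   # 0.32: best split of $30,000 is $10,000 / $20,000
```

Because the system needs every component, the recursion multiplies probabilities rather than adding values, which is the only difference from the additive formulation of Model IV.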

7. OPTIMAL SYSTEM FUNDING (MODEL IV) 

Suppose that a set of q systems is to be funded in which k = 1, 2, . . . , q. Suppose
that with respect to each system we have a numerical value V_k which expresses
the priority of the system. Furthermore, if $C_k is allocated to the kth system
we shall have determined that

P_k(C_k) = the probability of accomplishing the kth system within the specified
time period.

The values of P_k(C_k) are obtained from Model III.

Consider that $C is allocated to the set of q systems. We wish to find the
optimal allocation of $C across the systems to maximize the total expected
military value. We wish to

max Σ_{k=1}^{q} V_k P_k(C_k),   (1)

subject to

C = Σ_{k=1}^{q} C_k.   (2)

To solve this allocation problem we apply dynamic programming as follows:

f_1(C) = V_1 P_1(C),

f_2(C) = max_{0 ≤ C_2 ≤ C} [V_2 P_2(C_2) + f_1(C − C_2)],

  . . .                                                      (3)

f_q(C) = max_{0 ≤ C_q ≤ C} [V_q P_q(C_q) + f_{q−1}(C − C_q)].

Thus this allocation problem may be solved by finding the solution to q one- 
dimensional optimization problems. The results are the optimal system funding 
levels and maximum value as functions of different funding levels. 
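The additive recursion (3) can be sketched in the same grid form as the earlier models (Python; the function name, the two system curves, and the values V_1 = 10, V_2 = 5 are hypothetical):

```python
def fund_systems(values, curves, budget, h=10):
    """f_1(C) = V_1 P_1(C); f_k(C) = max over 0 <= C_k <= C of
    V_k P_k(C_k) + f_{k-1}(C - C_k), tabulated on a grid of spacing h."""
    grid = list(range(0, budget + h, h))
    f = {c: 0.0 for c in grid}
    for V, P in zip(values, curves):
        # One pass per system; each pass is a one-dimensional search:
        f = {c: max(V * P[x] + f[c - x] for x in grid if x <= c) for c in grid}
    return f

# Hypothetical Model III curves for two systems (cost in $1000):
curves = [{0: 0.0, 10: 0.5, 20: 0.7},
          {0: 0.0, 10: 0.6, 20: 0.8}]
f_sys = fund_systems([10, 5], curves, 20)
print(f_sys[20])   # 8.0: spend $10,000 on each system, not all on system 1
```

The example illustrates the point of the model: even though system 1 is twice as valuable, concentrating the whole budget on it yields only 10 × 0.7 = 7, while splitting the budget yields 10 × 0.5 + 5 × 0.6 = 8.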

We may obtain an intuitive solution to the system funding problem, in the
case of continuous P_k functions, as follows: form the Lagrange function of
(1) and (2),

L(C) = Σ_{k=1}^{q} V_k P_k(C_k) + λ(C − Σ_{k=1}^{q} C_k).   (4)

Differentiating (4) with respect to C_k, we have

V_k P'_k(C_k) − λ = 0,   k = 1, 2, . . . , q.   (5)

The optimal allocation of funds across systems is such that the product of value
and marginal increase in the probability of success is the same for all systems.
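Condition (5) can be checked numerically. In the sketch below (not from the paper: the exponential form P_k(C) = 1 − exp(−a_k C) and all constants are assumed purely for illustration), the multiplier λ is found by bisection so that the allocations exhaust the budget, after which every funded system shows the same marginal value V_k P'_k(C_k):

```python
import math

# Assumed curves: P_k(C) = 1 - exp(-a_k * C), so P_k'(C) = a_k * exp(-a_k * C)
# and, from (5), C_k = ln(V_k * a_k / lam) / a_k when that is positive.
V = [10.0, 5.0]
a = [0.10, 0.20]

def allocation(lam):
    return [max(0.0, math.log(v * ak / lam) / ak) for v, ak in zip(V, a)]

# Bisect on lambda so that total spending equals the budget C = 20:
lo, hi = 1e-6, 10.0
for _ in range(100):
    lam = (lo + hi) / 2
    if sum(allocation(lam)) > 20:
        lo = lam          # marginal value set too low: spending too much
    else:
        hi = lam
C_k = allocation(lam)
# At the optimum every funded system has the same V_k * P_k'(C_k):
marginals = [v * ak * math.exp(-ak * c) for v, ak, c in zip(V, a, C_k)]
print([round(c, 2) for c in C_k], [round(m, 3) for m in marginals])
```

With these constants the budget splits as roughly $13,330 and $6,670, and both marginal values equal λ, which is exactly the equal-marginal-value statement of (5).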





This procedure may be applied manually in the event that a small number
of systems is to be analyzed, and by electronic computer otherwise. The
dynamic programming procedure may be extended to consider restrictions
on the costing of different program elements.

8. CONCLUSIONS 

In this study a research project is viewed as a process of acquiring knowledge
sequentially, a process that is described by an ordered series-parallel system.
This leads to the development of the stochastic network concept, stochastic
because the nodes of the network represent the intermediate "bits or pieces"
of knowledge needed to acquire the knowledge goal and are not known with
certainty in advance. The stochastic network represents the researcher's planned
acquisition of knowledge. The change in this plan is represented by a con-
figurational change in the research network. The causes of a change in plans are
not investigated in this study. Rather, the effects of change are important.
Probability values are the researcher's subjective estimates of com-
pleting each link in the network, hence each chain. Research inputs in the form
of time expended and man-hours allocated on tasks are also estimated.

This report presents a theoretical model for solving four decision problems 
in research planning: 

1. Sequencing of research tasks. 

2. Selection of technical alternatives. 

3. Funding of component concepts. 

4. Cost allocation across systems. 

The models are sequential in that the results of each decision problem are
used in the next decision problem. The solutions are parameterized in that
all results are presented in cost-effectiveness form so that the decision maker
can select the minimal cost level to achieve the objective of effectiveness. The
solutions are adaptive to changes in the values of the model parameters.

To implement and make effective use of the models, data on parameter values
are required as inputs to the models. Only limited amounts of data are required,
however, for the initial decision problems. Only cost-risk ratios are needed in the
first problem. Problems 2 and 3 require estimates of the probability of success
and costs. In addition, numerical value is required in problem 4. It may be
observed that an estimate of numerical value is not necessary to a solution of
problems 1-3.

The use of dynamic programming permits the performance of sensitivity
analyses of solutions to changes in parameter values; for example, if a study
is to be performed on the effect of changes in values of the cost parameter of a
proposed technical alternative, the program may be operated so that this cost
parameter can be varied. A computer program of this procedure would permit
the researcher-planner to investigate the effect of alternate plans on budgets
and R and D performance.

Solutions may be obtained manually if the number of alternatives is less than
20, and otherwise may be programmed for an electronic computer.





9. REFERENCES 

[1] Richard Bellman, Dynamic Programming, Princeton University Press, 1957.

[2] Richard Bellman, Adaptive Control Processes, Princeton University Press, 1961.

[3] Richard Bellman and Stuart Dreyfus, Computational Aspects of Dynamic
Programming, Princeton University Press, 1961.

[4] Richard Bellman and Stuart Dreyfus, Applied Dynamic Programming, Prince-
ton University Press, 1962.

[5] C. G. Bigelow, "Bibliography on Project Planning and Conduct by Network
Analysis, 1959-1961," Operations Res., 10, 728-731 (1962).

[6] Robert G. Busacker and Thomas L. Saaty, Finite Graphs and Networks: An
Introduction with Applications, McGraw-Hill, New York, 1965.

[7] Ronald A. Howard, Dynamic Programming and Markov Processes, Technology
Press and Wiley, New York, 1960.

[8] Delmar W. Karger and Robert G. Murdick, Managing Engineering and Re-
search, Industrial Press, New York, 1963.

[9] J. E. Kelley, Jr., "Critical-Path Planning and Scheduling: Mathematical Basis,"
Operations Res., 9, 296-320 (1961).

[10] D. G. Malcolm, J. H. Roseboom, C. E. Clark, and W. Fazar, "Application of
a Technique for Research and Development Program Evaluation," Operations
Res., 7 (1959).

[11] H. Raiffa and R. Schlaifer, Applied Statistical Decision Theory, Harvard Uni-
versity Press, Cambridge, Mass., 1961.


RESEAUX STOCHASTIQUES ET PLANIFICATION 
DE LA RECHERCHE 


RESUME 

Cette étude traite des développements et des possibilités d'application d'une
méthode pour le planning des projets de recherche. La méthode du chemin
critique (C.P.M.) et la méthode PERT ont été utilisées pour le planning et le
contrôle des projets de recherche et de mise au point. Cependant ces méthodes
ne sont pas adéquates pour le planning d'activités comportant l'acquisition de
connaissances nouvelles dans la façon de conduire un projet de recherche, à
cause de la part d'incertitude qu'elles comportent.

Un projet de recherche est considéré comme une acquisition séquentielle
de connaissances et peut être schématisé par un graphe d'arcs orientés en série
et en parallèle.

Les événements (nœuds) sont les niveaux de connaissance et les activités
(les arcs) sont les opérations qui sont prévues pour atteindre chaque niveau de
connaissance.

Un graphe stochastique représente donc sous la forme d'un réseau le plan
d'étude du chercheur, pour accomplir les phases de la recherche et arriver à
chaque niveau de connaissance.



THEORY OF GRAPHS 


235 

Le graphe est stochastique du fait qu'à la fois tâches et événements sont de
nature probabiliste. Les paramètres associés à ce graphe (probabilité de
réussite, coût estimé, espérance mathématique de la durée de l'étude) sont
évalués subjectivement en utilisant les estimations périodiques du chercheur.
La structure du réseau peut également se modifier pendant l'exécution de la
recherche.

Les modèles stochastiques de prise de décision ont été développés pour
déterminer les séquences d'opérations à effectuer, la sélection des alternatives
entre opérations et le niveau de coût de chaque opération. Les critères de
décision comprennent la minimisation de la valeur moyenne des coûts à encourir
et la maximisation de la probabilité de réussite.


Algebraic Determination of Loops and Paths in Graphs 
Détermination Algébrique des Boucles et Chemins des Graphes


S. Okada 

The MITRE Corporation , 

Bedford , Massachusetts , United States of America 


By using the node-arc incidence matrix of a graph, a linear algebraic method of
determining all loops, including two methods of eliminating multiloop vectors,
is given as a preliminary to establishing an algorithm. Other possible methods
based on von Schouten's combinatorial solution are added. After describing a
few relationships between loops and paths, a new simple algorithm of deter-
mining all loops or paths is elaborated and exemplified. It is easily applicable
to large graphs by subdivision. The key to this algorithm is the introduction
of a new node-arc incidence matrix expression of any subgraph. Generalization
of the algebraic method to supply-demand problems is briefly sketched.


Au moyen de la matrice d'incidence d'un graphe, on décrit une méthode
algébrique pour déterminer tous les cycles d'un graphe, en éliminant auto-
matiquement les cycles composés. On examine d'autres méthodes possibles
basées sur la solution combinatoire de von Schouten.

Après avoir démontré quelques relations fondamentales entre cycles et
chaînes, un nouvel algorithme, plus simple, pour la détermination de tous les
cycles et les chaînes est élaboré. Il peut être appliqué à des graphes de grandes
dimensions par sous-divisions. Une généralisation de cette méthode algébrique
au problème de l'offre et de la demande est brièvement décrite.



236 


ABSTRACTS 


Pour une Bonne Comprehension du Nombre Chromatique 
par la Theorie des Families de Fonctions Booleennes 


Toward a Better Understanding of the Chromatic Number 
by Means of the Theory of Families of Boolean Functions 


C. Benzaken 
St. Martin-d'Heres , France 


La théorie des familles de fonctions booléennes, conçue en vue de problèmes de
synthèse, trouve une application quelque peu inattendue concernant les nombres
chromatiques des graphes.

Dans l'ensemble Ω des fonctions booléennes finies, on définit deux opérations
élémentaires :

(1) La réduction d'une fonction. (2) La composition de deux fonctions.

On désigne sous le nom de famille toute partie (⊂ Ω) fermée par rapport
à ces opérations.

Toutes les familles ont été exhibées. Parmi elles existe une chaîne que
nous pouvons écrire :

MS_2 ⊃ MS_3 ⊃ · · · ⊃ MS_i ⊃ MS_{i+1} ⊃ · · ·

À tout graphe Γ symétrique (défini à un isomorphisme près) on peut faire
correspondre une fonction booléenne γ appartenant à MS_2. Cette correspon-
dance est injective mais non surjective. Si i est le nombre chromatique de Γ,
cette fonction γ appartient à MS_i et non à MS_{i+1}.

Eu égard aux opérations élémentaires grâce auxquelles on a pu construire
ces familles, il est naturel d'étendre la notion de graphe à celle de polygraphe
pour qu'elle devienne en quelque sorte « isomorphe » à cette chaîne MS_i.

Peut-être cela permet-il une meilleure compréhension du nombre chro-
matique.


The theory of families of Boolean functions, conceived for the synthesis problem,
provides a slightly unexpected application to the chromatic numbers of graphs.

In the set Ω of all finite Boolean functions we define two elementary oper-
ations:

(1) The reduction of one function. (2) The composition of two functions.

By a family we mean any set F (⊂ Ω) closed under these operations. All
the families have been exhibited. Among them there exists a chain that we can
write as

MS_2 ⊃ MS_3 ⊃ · · · ⊃ MS_i ⊃ MS_{i+1} ⊃ · · ·





A Boolean function γ belonging to MS_2 corresponds to any given symmetric
graph Γ (up to an isomorphism). This correspondence is one to one but not
onto. If i is the chromatic number of Γ, γ belongs to MS_i but not to MS_{i+1}.

Considering the elementary operations that have allowed the construction
of these families, it is natural to extend the graph notion to the polygraph notion
that becomes in some way "isomorphic" to this chain MS_i.

Perhaps this will lead to a better understanding of the chromatic number.



SESSION B 


MARKETING 


Problèmes Commerciaux


Chairman: J. D. C. Little (United States of America)




DEVELOPMENT, VALIDATION, AND IMPLEMENTATION
OF COMPUTERIZED MICROANALYTIC
SIMULATIONS OF MARKET BEHAVIOR†

Développement, Validation, et Mise en Exécution de
Simulations Micro-analytiques du Comportement
d'un Marché à l'Aide d'un Ordinateur

Arnold E. Amstutz
Sloan School of Management

Massachusetts Institute of Technology, Cambridge, Massachusetts,
United States of America


Since 1959 I have been working with cooperating managements to develop,
validate, and implement microanalytic behavioral simulations designed to aid
formulation and evaluation of marketing policies and strategies [1]. This work
has focused on a particular class of decision situation in which the simulation
approach has shown unusual promise. These situations are characterized by
two common elements. First, outcomes are largely determined by complex
human behavior, and second, management must influence actions by persua-
sion in order to achieve desired results; management is not able to exercise
direct control.

The planning and implementation of marketing programs involve the co- 
ordination of many types of management activity designed to persuade the 
prospective customer to take actions or develop attitudes and beliefs favorable 
to the company’s brands. Market-oriented simulation systems focus on the 
processes through which management attempts to influence market behavior. 
The models on which such simulations are based encompass microanalytic 
representations of retailer, distributor, salesman, and consumer and industrial- 
purchaser behavior in the environment external to the firm. 

The objective of this article is to describe the process followed in develop-
ing, validating, and implementing a market-oriented microanalytic computer
simulation [2]. This process is illustrated by examples drawn from three
representative simulation-based management systems.


1. THE SYSTEM DEVELOPMENT PROCESS 

When developing a microanalytic computer simulation of market behavior, 
the firm and its competitors are viewed as input generators. The external 


† A portion of this research was done at Project MAC and the Computation Center
at the Massachusetts Institute of Technology, Cambridge.

241 



242 


ARNOLD E. AMSTUTZ 


market simulation is then designed to duplicate response characteristics of 
a comparable real-world market to inputs of the type generated by the firm 
and other market sectors. 

The process followed when working with management to develop a micro- 
analytic simulation of a company’s markets is illustrated by reviewing the 
steps taken by one management using this technique. 

Boundary Definition 

The first step in simulation is to establish boundary definitions that will
determine the detail and scope of the system to be developed. In most
instances this preliminary "macro specification" is relatively crude. Manage-
ment is encouraged to define a limited number of basic system elements.
Relationships between these elements are then summarized graphically, as
illustrated in Figure 1. This figure, developed in the course of discussions
with a food-product manufacturer, indicates management's concern with
interactions among manufacturers, distributors, retailers, and consumers in
a perishable food-product market.

Although Figure 1 may appear overly simplified, it is an important first 
step in the structuring process. Realistically complex representations evolve 
by gradual refinement of initially simple structures. 

Objective Formulation 

In the course of boundary definition management also specifies the 
objectives they hope to achieve by the use of the system once it has been 



Figure 1. A first-stage macro description of market interactions.







COMPUTERIZED MICROANALYTIC SIMULATIONS OF MARKET BEHAVIOR 243


developed, validated, and implemented. Objectives explicate criteria of
relevancy that determine whether a particular aspect of the environment will
be included in or excluded from the system. Objectives also indicate the level
of system detail and accuracy required by management.

Description of Macro Behavior 

Once the desired system scope and objectives to be achieved by system
use have been specified, the structuring process continues. Each sequential
step involves increasingly micro (detailed) description of behavior within the
environment to be simulated.

Figure 2 illustrates a later stage in management progress toward system
specification. Concepts illustrated in Figure 1 have been expanded by recognition
of government and salesmen. The description of interactions has become
more explicit. Information flow is now differentiated from product flow.
(In later stages additional recognition was given to the flow of capital.) At
this level of specification discussion is focused on bilateral channels relating
the manufacturer and his competitors to distributors and retailers through
their respective salesmen. Media promotion is represented by unilateral com-
munication channels from the manufacturer to the consumer directly and
through trade channels.


Figure 4. Decision and response function specification for three market sectors.

Once major interactions have been identified, attention is focused on
processes associated with each interaction, taking account of backlogs,
delays, and transfer points at which the rate of product, information, or value
flow may be measured. Figure 3 illustrates this development for the channel
of product flow represented by solid lines in Figure 2.

After interactions between sectors have been described, major decision
points within each sector are identified. When examining the manufacturer
sector, the objective is to identify major decisions affecting the generation of
inputs to the external environment [3]. Examination of other sectors is
directed toward decisions affected by manufacturer-generated inputs. Figure 4
illustrates one such description of decision and response factors within the
manufacturer, retailer, and consumer sectors [4].

Description of Decision Processes 

Once key decision and response elements have been identified, the focus 
of model development shifts from description of relationships to formulation 
of behavioral theory. Each decision point is described in terms of inputs to 
and outputs from that decision. Hypothesized relationships between inputs 
and observable behavior are formulated in terms of measurements that permit 
validation of the model against data from the real world. This process is 
illustrated with reference to the “decision to shop” noted in the consumer 
sector of Figure 4. 

The conceptual framework summarized in Figure 4 hypothesizes an 
explicit consumer decision to go to a store to seek a particular brand or infor- 
mation about that brand. This decision structure implies that consumers 
entering a store with an explicit intention to investigate or acquire a particular 
brand exhibit behavior significantly different from that of consumers who 
accidentally encounter a brand or product line in the course of broader 
shopping (search) activity. 

The Perceived Need Concept — An Example 

Evaluation of management hypotheses regarding the decision to shop led
to a qualitative concept of “perceived need.” This concept, which was later
quantified in terms of a measure of expressed intention to buy, might be
viewed as an extension of utility theory. When formulating this model,
management proposed that the consumer’s motivation to take action to acquire
a particular brand is related to his perceived need for that brand, which in-
creases with (a) positive attitude toward the brand, (b) opportunity for brand
use, and (c) time since purchase. This qualitative concept was later refined
to the series of relationships illustrated in Figures 6, 7, and 8.
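The three-factor concept can be sketched as a small function. Everything numeric here is an illustrative assumption: the paper derives the actual forms empirically (Figures 6, 7, and 8), while the scalings, caps, and equal weighting below are invented for the sketch.

```python
def perceived_need(attitude, use_opportunities, time_since_purchase):
    """Illustrative perceived-need index for one consumer and one brand.

    attitude            : -5 (strongly dislike) .. +5 (strongly favor)
    use_opportunities   : brand-use opportunities in the preceding quarter
    time_since_purchase : multiples of average product life

    Each component is scaled to the unit interval and the three are
    averaged; these choices are hypothetical, not the fitted curves.
    """
    attitude_term = (attitude + 5) / 10.0               # favorable attitude raises need
    use_term = min(use_opportunities / 20.0, 1.0)       # linear in opportunity, capped
    recency_term = min(time_since_purchase, 2.0) / 2.0  # need grows after purchase
    return (attitude_term + use_term + recency_term) / 3.0
```

The function is monotone increasing in each of the three stated factors, which is all the qualitative concept asserts.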



COMPUTERIZED MICROANALYTIC SIMULATIONS OF MARKET BEHAVIOR 249 


The Effect of Attitude

Using a modified Osgood scale, consumer
orientation (attitude) toward a brand is measured
by asking a respondent to rate the brand on an
eleven-point scale from +5 (strongly favor)
through 0 (indifferent) to -5 (strongly dislike)
[5]. The observed relationship between attitude
(measured with the scale shown in Figure 5) and
“perceived need” is illustrated in Figure 6.

Use Opportunity 

Use opportunity is measured in terms of the
number of times that the consumer had an oppor-
tunity to use a brand within the product class
being studied during the preceding quarter. This
information is obtained by direct interview as well as diary maintenance. As
illustrated in Figure 7, a linear association was established between the use-
opportunity and perceived-need measures.

Time Since Purchase 

The time since purchase is measured, as the name suggests, by determin-
ing the time (in weeks or average product life) since the consumer last pur-
chased a brand in the product class being studied. Figure 8 illustrates the
general form of this relationship, expressed in multiples of average product
life for the current perishable food-product example.

Income Stratification 

Initial attempts to validate the perceived-need construct produced
evidence that the relationship between the three perceived-need measures
and actual shopping behavior is income-dependent. Further investigation
revealed that behavior could be differentiated by population subsegments
established on the basis of income-level stratification, as illustrated in
Figure 9.

Probability-of-Shopping Function

Combining the three elements of perceived need with income stratification
produced a function of the type illustrated in Figure 10, relating the
probability of shopping for the food product to perceived need and income.
This figure specifies the perceived-need-based function for each of the income
levels stratified in Figure 9.
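A stratified family of functions of this general shape can be sketched as follows. The income boundaries and per-stratum slopes are hypothetical placeholders, not the empirically fitted curves of Figures 9 and 10.

```python
import bisect

INCOME_BREAKS = [4000, 8000, 12000]   # hypothetical stratum boundaries ($/year)
SLOPES = [0.4, 0.6, 0.8, 1.0]         # hypothetical response slope per stratum

def income_stratum(income):
    """Index (0..3) of the income stratum containing `income`."""
    return bisect.bisect_right(INCOME_BREAKS, income)

def probability_of_shopping(perceived_need, income):
    """Shopping probability as a function of perceived need and income.

    One linear curve per income stratum, clipped to [0, 1], mimicking
    the family of curves in Figure 10 (the steeper response at higher
    income is an assumption).
    """
    p = SLOPES[income_stratum(income)] * perceived_need
    return min(max(p, 0.0), 1.0)
```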

Additional Function Formulation 

In a similar manner each decision and response function defined by macro
specifications is investigated. In some instances initial theoretical constructs
are validated. In others empirical evidence suggesting alternative constructs
is obtained, and the process of formulation is repeated for revised structures.


Figure 5. The attitude scale. [Scale runs from +5 (Like) through 0 (Indifferent) to -5 (Dislike).]













Figure 10. Probability of shopping as a function of perceived need and income. [Horizontal axis: perceived need, 0.10 to 1.00.]


Explicit Decision Representation 

Decision and response functions are formulated and tested as probabilities,
since referenced data are in the form of frequency distributions. Generation
of explicit decision outputs for each cell within a simulated population re-
quires conversion of the probabilistic statement into explicit yes/no decisions.
A number drawn randomly from a rectangular distribution of range 0 to 1.0
is compared with the stated probability to determine the occurrence of the
probabilistic event [6].
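The conversion described above amounts to a single Bernoulli draw per decision. A minimal sketch in Python (the function name and the seed are arbitrary choices for illustration):

```python
import random

def realize(probability, rng):
    """Convert a probabilistic statement into an explicit yes/no decision.

    A number drawn from a rectangular (uniform) distribution on [0, 1)
    is compared with the stated probability; a draw less than or equal
    to the probability counts as a positive outcome (cf. reference [6]).
    """
    return rng.random() <= probability

# Over many simulated cells the realized frequency of positive outcomes
# approaches the stated probability.
rng = random.Random(117)                   # arbitrary seed for repeatability
outcomes = [realize(0.3, rng) for _ in range(10_000)]
frequency = sum(outcomes) / len(outcomes)  # close to 0.3
```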


2. BEHAVIOR OF AN ARTIFICIAL POPULATION 

Once validated at the function level, decision and response formulations of
the type just described are combined in a simulation structure. Operating
within the framework supplied by the simulation system, these functions
determine the actions and responses of simulated population members.

This stage in the simulation process is illustrated by using output from an
appliance-market simulation. This system of models was developed following
procedures comparable to those just described for the food product. Similar
concepts and measures as well as parallel model structures are evident.

A Week in the Life of a Simulated Consumer 

Figure 11 was obtained by monitoring the “thoughts and actions” of one
member of a simulated appliance-market population during a simulated week
in which the population experienced events comparable to those encountered
by a real-world population during the week beginning February 19, 1962.

Identifying Characteristics 

The information provided, beginning with the third line of output in
Figure 11, identifies characteristic attributes of consumer 109. He is a subur-
ban (SU) resident of New England (NE) between 25 and 35 years of age,
with an income between $8,000 and $10,000 a year, and has a college
education. He now owns a brand 3 appliance purchased six years ago.




SIMULATION APP-03 TEST RUN APRIL 4, 1965 1400 HOURS
-- CONSUMER 0109 NOW BEGINNING WEEK 117 -- FEBRUARY 19, 1962

- REPORT MONITOR SPECIFIED. TO CANCEL PUSH INTERRUPT
- CHARAC - REGION NE SU, AGE 25-35, INCOME 8-10K, EDUCATION [value lost in scan]
- BRANDS OWN 3, 6 YEARS OLD. RETAILER PREFERENCE 05, 11, 03
- MEDIA AVAILABLE 1 0 0 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
- ATTITUDES  1 2 3 4 5 6 7 8 9 10 11 12
  PROD CHAR  [values garbled in the scan]
  APPEALS    [values garbled in the scan]
  BRANDS     [values garbled in the scan]
  RETAILERS  [values garbled in the scan]
- AWARENESS  X 0 0 0

- MEMORY DUMP FOLLOWS, BRANDS LISTED IN DESCENDING ORDER 1 TO 4
  PRODUCT CHARACTERISTIC MEMORY 1-12, APPEALS MEMORY 1-12
  [memory-count matrix; columns garbled in the scan]

- MEDIA EXPOSURE INITIATED

- MEDIUM 003 APPEARS IN WEEK 117 -- NO EXPOSURES
- MEDIUM 004 APPEARS IN WEEK 117
- EXPOSURE TO AD 013, BRAND 3 -- NO NOTING
- EXPOSURE TO AD 019, BRAND 4
- AD 019, BRAND 4 NOTED. CONTENT FOLLOWS
- PROD. C 11 P-4, 4 P-2,
- APPEALS 5 P-2, 7 P-2, 12 P-2,
- MEDIUM 007 APPEARS IN WEEK 117 -- NO EXPOSURES
- MEDIUM 012 APPEARS IN WEEK 117
- EXPOSURE TO AD 007, BRAND 2
- AD 007, BRAND 2 NOTED. CONTENT FOLLOWS
- PROD. C 8 P-3, 12 P-1,
- APPEALS 2 P-1, 4 P-1, 6 P-1, 10 P-1,
- EXPOSURE TO AD 013, BRAND 3 -- NO NOTING
- EXPOSURE TO AD 004, BRAND 1 -- NO NOTING
- MEDIUM 016 APPEARS IN WEEK 117 -- NO EXPOSURES
- MEDIUM 023 APPEARS IN WEEK 117 -- NO EXPOSURES
- WORD OF MOUTH EXPOSURE INITIATED
- EXPOSURE TO CONSUMER 0093 -- NO NOTING
- EXPOSURE TO CONSUMER 0104 -- NO NOTING
- EXPOSURE TO CONSUMER 0117 -- NO NOTING
- NO PRODUCT USE IN WEEK 117
- DECISION TO SHOP POSITIVE -- BRAND 3 HIGH PERCEIVED NEED
-- RETAILER 05 CHOSEN


Figure 11. 






- SHOPPING INITIATED
- CONSUMER DECISION EXPLICIT FOR BRAND 3 -- NO SEARCH
- PRODUCT EXPOSURE FOR BRAND 3
- EXPOSURE TO POINT OF SALE 008 FOR BRAND 3
- POS 008, BRAND 3 NOTED. CONTENT FOLLOWS
- PROD. C 3 P-1, 6 P-1,
- APPEALS 5 P-2, 7 P-2, 10 P-2, 11 P-2,
- NO SELLING EFFORT EXPOSURE IN RETAILER 05
- DECISION TO PURCHASE POSITIVE -- BRAND 3, $38.50
- DELIVERY IMMEDIATE
- OWNERSHIP - 3, AWARENESS WAS 2, NOW 3
- WORD OF MOUTH GENERATION INITIATED
- CONTENT GENERATED, BRAND 3
- PROD. C 2 P-+15, 8 P-+15,
- APPEALS 4 P-+50, 11 P-+45
- FORGETTING INITIATED -- NO FORGETTING
-- CONSUMER 0109 NOW CONCLUDING WEEK 117 -- FEBRUARY 25, 1962
-- CONSUMER 0110 NOW BEGINNING WEEK 117 -- FEBRUARY 19, 1962

Figure 11 (continued).

Consumer 109 presently favors retailers 5, 11, and 3, in that order. He
subscribes to or otherwise has available media of types 1, 4, 9, 10, 11, and 12.
Media of types 2, 3, 5, 6, 7, 8, and 13 through 24 are not available to him.

Consumer 109’s attitudes are summarized in a matrix beginning on line 6
of Figure 11. This matrix indicates his orientation toward 12 product
characteristics, 12 appeals, 4 brands, and 18 retailers. From these figures it
may be established that the most important (highest attitude) product char-
acteristic insofar as consumer 109 is concerned is characteristic 8, which he
regards very highly (+5). Appeals 11 and 4 are similarly indicated as of
primary importance to this artificial consumer. From the retailer attitude
portion of this matrix his preference for retailers 11 and 5 (both +5 attitudes)
and 3 and 16 (both +3 attitudes) may be established. The final entry in the
orientation matrix indicates that consumer 109 is aware of brand 1 [7].

Consumer Memory Content 

The line stating “MEMORY DUMP FOLLOWS. BRANDS LISTED
IN DESCENDING ORDER 1 THROUGH 4” introduces the print-out
of consumer 109’s present simulated memory content. This memory dump
is a record of noted communications retained by the consumer relating specific
product characteristics and appeals to each of four brands. From this report
it can be established, for example, that consumer 109 has retained 14 com-
munication exposures associating product characteristic 8 with brand 1, 13
exposures relating product characteristic 8 with brand 2, and 14 exposures
associating appeal 7 with brand 3.
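The memory dump can be thought of as per-brand exposure counters. A sketch of that structure follows; the function and field names are invented for illustration, and only the general bookkeeping reflects the paper's description.

```python
from collections import defaultdict

def new_memory():
    """Per-brand counters of retained communication exposures."""
    return defaultdict(lambda: {"characteristics": defaultdict(int),
                                "appeals": defaultdict(int)})

def note_communication(memory, brand, characteristics=(), appeals=()):
    """Record one noted communication in the consumer's memory."""
    for c in characteristics:
        memory[brand]["characteristics"][c] += 1
    for a in appeals:
        memory[brand]["appeals"][a] += 1

memory = new_memory()
# e.g. the noted ad for brand 4 referencing characteristics 11 and 4
# and appeals 5, 7, and 12 (as in the Figure 11 trace)
note_communication(memory, 4, characteristics=[11, 4], appeals=[5, 7, 12])
```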




Media Exposure and Response 

The entry in the report following the memory dump indicates that the
segment of the simulation representing media exposure processes has been
entered. Six media appear (are published or broadcast) during week 117.
Consumer 109 is not exposed to medium 3, for that medium is not available
to him (see the media availability indicator in the characteristic output). Medium
4 also appears in week 117, and because it is available to consumer 109 he may
be exposed to relevant ads appearing in it. The output indicates that he is ex-
posed to an advertisement for brand 3 but does not note that communication.
On the other hand, an advertisement for brand 4 also present in medium 4
during week 117 is noted, as indicated by the line reading AD 019, BRAND
4 NOTED. CONTENT FOLLOWS. The output message then indicates
that advertisement 19 contains a high-prominence (4) [8] reference to product
characteristic 11 and a medium-prominence (2) reference to characteristic 4.
Advertisement 19 also contains medium-prominence references to appeals
5, 7, and 12.

Consumer 109 does not see medium 7, although it appears in week 117.
However, he is exposed to three advertisements in medium 12, which also
appears during that week. The advertisement for brand 2 is noted, whereas
those for brands 3 and 1 are not. Media 16 and 23 also appear in week 117
but are not seen by consumer 109.
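The gating of exposures by availability reduces to a set intersection. A sketch using consumer 109's week-117 data from the trace (the function name is an invention for illustration):

```python
def reachable_media(available, appearing):
    """Media that can expose the consumer this week: those that both
    appear (are published or broadcast) and are available to him."""
    return sorted(set(available) & set(appearing))

available_to_109 = {1, 4, 9, 10, 11, 12}     # media available to consumer 109
appearing_week_117 = {3, 4, 7, 12, 16, 23}   # media appearing in week 117
exposed_via = reachable_media(available_to_109, appearing_week_117)  # [4, 12]
```

The result matches the trace: only media 4 and 12 can carry advertisements to consumer 109 in week 117.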

Word-of-Mouth Exposure

Report entries following the media exposure section indicate that con- 
sumer 109 is exposed to word-of-mouth comment generated by consumers 
93, 104, and 117, but fails to note communication from any of these indivi- 
duals. Had noting occurred, a message-content report comparable to that 
generated for advertising would have specified the information noted. 

Product Experience 

Consumer 109 did not have product experience during week 117. Had 
he made use of the product, a report of his response to product use indicating 
product characteristics or appeals, if any, emphasized by the use experience 
would have been printed. 

Decision to Shop 

The next entry in the Figure 11 output indicates that consumer 109 has
made an explicit decision to shop, that his highest perceived need is for brand
3, and that his first-choice retailer is 5. Simulation models representing
in-store experience have been loaded.

In-Store Experience 

The first entry within the SHOPPING INITIATED section notes that
the consumer is exhibiting behavior associated with the explicit-decision-to-
shop option and is seeking brand 3 (there is therefore no search activity and
no opportunity for accidental exposure). Simulated retailer 5 is carrying
brand 3; therefore consumer 109 finds the brand he is seeking.

Retailer 5 has placed point-of-sale display material for brand 3. The con-
sumer is exposed to it and notes its content, which emphasizes appeals 3 and 6
and product characteristics 5, 7, 10, and 11 as attributes of brand 3. Retailer
5’s simulated salesmen are either not pushing brand 3 or are busy with other
customers. In any event, consumer 109 is not exposed to selling effort while
shopping in retailer outlet 5.

Decision to Purchase 

The output statement DECISION TO PURCHASE POSITIVE — 
BRAND 03, $38.50, specifies that consumer 109 has made a decision to pur- 
chase brand 3 at a price of $38.50. The line following indicates that retailer 5 
can make immediate delivery of brand 3. 

Response to Purchase 

Since consumer 109 has now purchased brand 3, his awareness, which was 
favoring brand 2, is changed to favor brand 3. 

Word-of-Mouth Generation

Since consumer 109 is now the proud owner of a brand 3 product, it is not
surprising to find him initiating word-of-mouth comment regarding his new
purchase. The content of his communication regarding brand 3 emphasizes
product characteristics 2 and 8 and appeals 4 and 11 — the appeals and
product characteristics toward which he has the highest perceived brand
image, as indicated in the previous memory dump.

Forgetting 

Consumer 109 did not lose any of his existing memory content during
week 117.

The final output line of Figure 11 indicates that consumer 109 has 
concluded week 117. 

Simulated Population Behavior 

The behavior of population groups within each simulation sector is de-
scribed by accumulating simulated individual behavior. Population behavior
may be summarized in terms of the proportion of purchases allocated to each
brand (brand shares), changes in population attitude distributions toward
brands, or changes in the perceived brand images held by significant popu-
lation segments.
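Accumulating individual behavior into brand shares is a straightforward tally. A minimal sketch (the purchase list is invented for illustration):

```python
from collections import Counter

def brand_shares(purchases):
    """Proportion of purchases allocated to each brand.

    `purchases` holds one brand identifier per simulated purchase event,
    accumulated across the population.
    """
    counts = Counter(purchases)
    total = sum(counts.values())
    return {brand: count / total for brand, count in counts.items()}

# e.g. eight purchase events from a tiny illustrative population
shares = brand_shares([3, 3, 1, 2, 3, 1, 4, 3])   # brand 3 share = 0.5
```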

One Year in the Lives of Two Simulated Doctors 

Once the reasonableness of simulated behavior of the type outlined in 
Figure 11 has been established, the system may be used to produce behavior 
over time. This aspect of simulation testing is illustrated by output obtained 







from a third system — a microanalytic simulation of the doctor population in
the United States [9]. Figure 12 illustrates the cumulative prescriptions of
10 drugs written by two general practitioners in the simulated medical en-
vironment. These two doctors prescribed only one relevant drug during the
first two weeks of simulated activity. As the simulated year progressed, how-
ever, they used other drugs in the set of 10, and by year end their cumulative
prescription shares were 37.5, 28.3, 21.1, 5.5, and 5.0 for drugs 1-Y, 2-X, 2-O,
1-O, and 1-+, as illustrated at the end of the time plot.



Figure 12. Simulation test run: 1961 time-path simulation for 1 through 10. [Vertical axes: class share; horizontal axis ticks at 12.0, 24.0, 36.0, 48.0.]


Total Population Behavior 

Output of the type illustrated in Figure 12 is used primarily to test system 
stability. Two simulated G.P.’s are no more representative than two real-world 
doctors. Meaningful tests of system response require examination of the 
behavior exhibited by major population segments. Figure 13 illustrates the 
weekly prescriptions of 10 drugs by 100 members of an artificial population 
segment during the simulated year 1961. These simulated drug-usage figures 
may be directly compared with data generated during a comparable period 
in a real-world test market. 



3. SYSTEM VALIDATION 

Once a system has been developed and tested to the point at which manage- 
ment is convinced of its viability, validation tests designed to determine the 
extent to which the simulation is an accurate representation of a real-world 
environment must be undertaken. Validation testing generally proceeds 
sequentially from function analysis through cell and population validation. 


Figure 13. Simulation-year 1961 time-path simulation for 21 through 30.



Function-Level Validation 

The first step in function validation is a sensitivity analysis that indicates 
the relative sensitivity of total system performance to various functions within 
the system structure. Sensitivity testing establishes priorities for functional 
validations, for it is most reasonable to expend effort in validating those 
functions on which system performance appears to be most dependent. 

In validating a functional relationship the generally followed procedure
is to test the null hypothesis that observed relationships are due to random






variation by the application of a chi-square test. Once the null hypothesis is 
rejected, usually at the 1 per cent level, the degree of correspondence between 
real-world data and theoretical function form is established by using standard 
curve-fitting techniques. 
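The test can be sketched with a hand-computed chi-square statistic. The counts below are invented, and the 1 per cent critical value for 3 degrees of freedom is taken from standard chi-square tables.

```python
def chi_square_statistic(observed, expected):
    """Chi-square statistic for observed vs. expected frequency counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical shopping counts in four perceived-need intervals.
observed = [12, 30, 55, 83]

# Null hypothesis: the counts reflect only random variation, i.e. are
# uniform across the intervals.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

stat = chi_square_statistic(observed, expected)   # about 63.5 here

CRITICAL_1PCT_3DF = 11.345   # chi-square, 1 per cent level, 3 d.f.
null_rejected = stat > CRITICAL_1PCT_3DF          # True: go on to curve fitting
```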

Cell-Level Validation 

Validation at the cell level seeks to establish that the behavior of an individual
within the simulated population cannot be differentiated from that of a similar
member of the real-world population. First-level testing is of the type
suggested by Turing [10]. Later tests are designed to ensure that the distri- 
bution of relevant parameter values (e.g., frequency of brand purchases and 
changes in attitude and knowledge) exhibited by the simulated and real-world 
consumers under comparable conditions are statistically indistinguishable. 

Population-Level Validation 

Tests focusing on the simulated population are designed to establish the 
degree of correspondence between behavior exhibited by members of the 
simulated population and that exhibited by members of the real-world 
population measured in terms of variables relevant to management. 

Reliability Testing 

In beginning population testing it is necessary to establish that model 
performance is relatively insensitive to different random-number seeds used 
on sequential system runs. As an example, in the simulation used to generate 
the output illustrated in Figure 13 terminal drug share deviation between 
runs is less than 1 per cent. 
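Seed-sensitivity checking can be sketched with a toy stand-in for a full run; the prescribing probability, run length, and population size below are arbitrary assumptions, not the study's parameters.

```python
import random

def terminal_share(seed, weeks=52, doctors=100, p_drug=0.34):
    """Toy stand-in for a simulation run: terminal share of one drug
    after probabilistic weekly prescribing by `doctors` simulated G.P.'s."""
    rng = random.Random(seed)
    events = weeks * doctors
    hits = sum(rng.random() < p_drug for _ in range(events))
    return hits / events

run_a = terminal_share(seed=1)
run_b = terminal_share(seed=2)
seed_deviation = abs(run_a - run_b)   # small if the model is seed-insensitive
```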

Performance Validation 

The acid test of simulation validity is the ability to duplicate historical 
real-world population behavior under comparable input conditions. In con- 
ducting such tests, the population is initialized to duplicate the distribution 
of all relevant parameters as they existed at a specified point in time in the 
real-world environment. In the case of the Figure 13 run the artificial 
population had been initialized to correspond to conditions existing at the 
beginning of 1961. 

Inputs to the simulation during performance tests described conditions 
existing in the real world during the referenced time period. In the test 
illustrated in Figure 13 conditions were those existing during 1961. Inputs 
specified the content and related media allocation for all journal, direct mail, 
and detail man promotion generated by competitors operating in the market 
during 1961. In addition, drug characteristics and distribution conditions 
were established to correspond to conditions in the real world during 1961. 

Analytical procedures applied in performance testing may be summarized
with reference to the data plotted in Figure 13. The first test performed follow-
ing this simulation established that the rank orders of drug shares at the end
of 1961 in the real and simulated worlds were equivalent. Actual and simulated
data comparisons are presented in Figure 14.





                                           Year End Rank
Identification   Rank as Initialized   Simulated   Actual

1-Y                      4                 2          2
1-O                      5                 6          6
1-X                      6                 5          5
1-+                      8                 8          8
1-□                     10                10         10
2-□                      1                 1          1
2-O                      2                 3          3
2-X                      3                 4          4
2-+                      7                 7          7
2-Y                      9                 9          9

Figure 14. Drug rank order comparison.
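The rank-equivalence check of Figure 14 can be sketched directly. The share values below are illustrative, not the study's data, and the box-symbol drug is renamed "2-B" to stay in plain ASCII.

```python
def rank_order(shares):
    """Map each drug to its rank by share (1 = largest)."""
    ordered = sorted(shares, key=lambda drug: -shares[drug])
    return {drug: rank for rank, drug in enumerate(ordered, start=1)}

simulated = {"2-B": 27.6, "1-Y": 15.0, "2-O": 13.0, "2-X": 12.2}
actual    = {"2-B": 28.8, "1-Y": 16.1, "2-O": 14.4, "2-X": 12.7}

ranks_equivalent = rank_order(simulated) == rank_order(actual)   # True here
```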


The absolute values of therapeutic class shares generated by the simulated
and real-world populations were then examined. As indicated in Figure 15,
the total error between actual and simulated drug shares at the end of 1961
was 5.1 per cent.


Identification   Initialization Value (%)   Year End Value (%)       Difference
                                            Simulated    Actual      (magnitude) (%)

[Figure 15 tabulates these columns for drugs 1-Y, 1-O, 1-X, 1-+, 1-□, 2-□, 2-O, 2-X, 2-+, and 2-Y; the individual cell values are garbled in the scan. Column totals: 99.8, 99.5, and 100.0.]

Figure 15. Absolute share of therapeutic class comparison.


A final class of performance tests focuses on the extent of correspondence
between actual and predicted market shares throughout the entire time period
covered by the simulation. Figure 16 illustrates the procedure used to obtain
this measure for the 1961 simulation test data. The maximum error in simu-
lation-based prediction for any drug was 5.2 per cent, whereas the average
error over this time period was 0.7 per cent.






Total deviation for drug d:   Σ_t |actual(d, t) − simulated(d, t)|

Average error for 10 drugs:   (1/10) Σ_d (total deviation for drug d)

Figure 16. Over-time market share deviation — measurement illustration.
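The Figure 16 measure reduces to absolute deviations summed over time and then averaged over drugs. A minimal sketch with invented share paths (two drugs instead of ten, and "2-B" standing in for the box-symbol drug):

```python
def total_deviation(actual, simulated):
    """Sum over time of |actual(d, t) - simulated(d, t)| for one drug."""
    return sum(abs(a - s) for a, s in zip(actual, simulated))

def average_error(actual_by_drug, simulated_by_drug):
    """Average of the per-drug total deviations over all drugs."""
    return sum(total_deviation(actual_by_drug[d], simulated_by_drug[d])
               for d in actual_by_drug) / len(actual_by_drug)

# Two drugs, four observation points each (illustrative shares, per cent).
actual_shares    = {"1-Y": [14, 15, 16, 16], "2-B": [28, 28, 29, 29]}
simulated_shares = {"1-Y": [13, 15, 17, 16], "2-B": [27, 29, 29, 28]}

avg = average_error(actual_shares, simulated_shares)   # (2 + 3) / 2 = 2.5
```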

4. MANAGEMENT USES OF SIMULATION

Given systems of the type described in this article, management must assess 
system performance in terms of intended applications. If, in their opinion, 
performance is sufficient to warrant use of the simulation as a representation 
of the real-world environment, applications of the type outlined below may 
be appropriate. If, in their opinion, however, the simulation fails to duplicate 
salient attributes of the real-world environment, further development leading 
to a more refined system must be undertaken or the use of the technique 
rejected. 

Testing Implicit Models 

One of the first benefits to accrue from the development of a simulation 
system is the systematic testing of management conceptions of the environ- 
ment in which they operate. In reviewing alternative formulations and 
evaluating functions, cell-model behavior, and total population performance, 
management must make explicit the often implicit models on which their 
decision making is based. 

The “What If?” Question

Given that management accepts simulation performance as indicative of 
real-world response under comparable conditions, the simulation becomes a 
test market without a memory in which management may examine with 
impunity the implications of alternative policies and strategies. Whether 
introducing new products or considering modification of a marketing pro- 
gram, management may apply alternative strategies in the simulated environment
and evaluate their implications under various assumed competitive conditions.



COMPUTERIZED MICRO ANALYTIC SIMULATIONS OF MARKET BEHAVIOR 261 


The effectiveness of such pretesting is dependent on management’s ability 
to predict probable competitive responses to proposed actions as well as the 
accuracy of the simulation system. Management may find it profitable to 
examine the impact of best- and worst-case competitive response patterns. 
In most instances the best case assumes that competition will continue with 
programs developed before the initiation of company actions, whereas the 
worst case assumes full competitor knowledge of the proposed company 
program and combined action to thwart company efforts. 

Performance References 

The simulated environment provides the references against which the 
progress of operations in the real world may be measured. Given a simulation 
pretest, management can determine by monitoring appropriate variables 
whether or not a program is progressing as planned. If conditions producing 
satisfactory performance in the simulated environment are encountered in 
the real world, it is assumed that final results will be comparable. Differences
between simulated and experienced results are viewed as potential explanations of
failure to achieve real-world performance comparable to simulation results.


5. SUMMARY 

This article has examined procedures followed in developing, testing, and
validating computerized microanalytic behavioral simulations. The process
of boundary definition, macro and micro behavior description, and decision-
function formulation has been discussed with reference to sample system
structures. System performance characteristics and procedures for simulation
validation at the function, cell, and population levels were considered in the
context of output obtained from three representative microanalytic-simulation-
based systems. It has been suggested that such systems have the potential
to contribute significantly as vehicles for testing preconceptions regarding
complex environments, evaluating the implications of alternative policies and
strategies, and providing performance references against which the effective-
ness of implemented plans may be assessed.


6. REFERENCES 

[1] A more detailed discussion of work with one representative company is provided
in A. E. Amstutz and H. J. Claycamp, “Simulation Techniques in Analysis of
Marketing Strategy,” in Applications of the Sciences in Marketing Management,
Purdue University, Lafayette, Ind., 1966.

[2] A thorough discussion of this process is provided in A. E. Amstutz, Computer
Simulation of Competitive Market Response, M.I.T. Press, Cambridge, Mass.,
1967.

[3] Input-associated decisions associated with a wide range of consumer and industrial
products are discussed in A. E. Amstutz, “The Manufacturer — Marketing Deci-
sion Maker,” ibid., Chapter 7.

[4] This decision structure is considered extensively in A. E. Amstutz, “A Model
of Consumer Behavior,” ibid., Chapter 8.

[5] The derivation and use of this and other intermediate measures of consumer
disposition are discussed in A. E. Amstutz, “Quantification of Marketing Pro-
cesses,” ibid., Chapter 5.

[6] If the number drawn is less than or equal to the stated probability, a positive
outcome is assumed.

[7] The awareness measure used in this system is indicative of the respondent’s
top-of-mind cognizance, determined by eliciting the name of the first brand in
a product class that “comes to mind”; see A. E. Amstutz, “Quantification of
Marketing Processes,” op. cit., Chapter 5.

[8] A five-point (0-4) prominence scale is used to code the content of all communications
inputted to the model. Each communication is evaluated by using the following
coding structure:

Level of Prominence                           Evaluation Scale

Extremely prominent — impossible to miss             4
Very prominent — major emphasis given                3
Average prominence — normal identification           2
Present but not prominent — easily missed            1
Not present — impossible to determine                0

See A. E. Amstutz, “The Manufacturer — Marketing Decision
Maker,” op. cit., Chapter 7.

[9] This system is discussed extensively in A. E. Amstutz and H. J. Claycamp, loc.
cit.

[10] Turing has suggested that if a person knowledgeable in the area of simulated
decision making cannot distinguish the modeled behavior from reality the model
is realistic. See A. M. Turing, “Computing Machinery and Intelligence,” Mind,
433-460 (October 1950).


DEVELOPPEMENT, VALIDATION, ET MISE EN
EXECUTION DE SIMULATIONS MICRO-ANALYTIQUES
DU COMPORTEMENT D'UN MARCHE
A L'AIDE D'UN ORDINATEUR


RÉSUMÉ

This paper reviews the development, testing, and implementation of
large-scale microanalytic simulations of market behavior carried out with an
electronic computer. Problems associated with model conception and specifi-
cation, function verification, system and subsystem testing, and model
validation are compared with those of other operating systems. Sample
results obtained in regional markets are evaluated, and possibilities for
management application of the system are discussed.



COMPUTER-AIDED PREPARATION OF MAGAZINE ADVERTISEMENTS 


263 


COMPUTER-AIDED PREPARATION OF MAGAZINE
ADVERTISEMENT FORMATS

Préparation des Formes de Publicité dans les
Magazines à l'Aide d'un Ordinateur

Daniel S. Diamond†

Sloan School of Management

Massachusetts Institute of Technology, Cambridge, Massachusetts
United States of America


Although most attempts toward quantification in advertising have involved 
budgets, media selection, and media scheduling, this article deals directly 
with magazine advertisement design. It is first proposed that before an adver- 
tisement can have any effect on the consumer it must attract his attention. 
Readership is suggested as a measure of the attention-getting power of an 
advertisement. Next, a set of six Starch Magazine Advertisement Readership 
models is constructed by means of multiple regression analysis on 1070 
advertisements appearing in Life magazine. The independent variables arc 
advertisement format descriptors, such as size, number of colors, and position 
in magazine — 12 in all. Finally, a conversational computer program is de- 
veloped which requests from its user a readership objective function (a func- 
tion of the six Starch readership scores for the advertisement and its cost), 
format restrictions, and a budget constraint. The program then prepares 
that advertisement format which, although conforming to all restrictions, 
maximizes the objective function. 


1. INTRODUCTION 

1.1. Attention-Getting Power and Advertisement Readership 

The function of advertising is to give the consumer information about a 
product, to remind him of its existence, and to persuade him to buy it. Most 
would agree that “ what he [the advertiser] is really concerned about is 
whether one advertisement or one advertising campaign will produce more
sales — or fewer sales — than another advertisement or another campaign" [2].
Before an advertisement can produce a sale, however, it must attract the con-
sumer's attention. Only when this has been accomplished can the content
of the advertisement have any effect whatsoever on the consumer. For 
present purposes it is assumed that the degree to which the consumer’s atten- 
tion is attracted toward an advertisement may be measured approximately 

† The work reported here is based in part on the author's M.S. thesis at M.I.T. [1].





by readership scores. Although the readership of an advertisement and its 
attention-getting power may not be identical, readership scores present a 
convenient and probably fairly accurate measure of this attribute. 

1.2. Starch Magazine Advertisement Readership Scores 

Readership measures of magazine advertisements are made by various 
commercial organizations. Those used here were made by Daniel Starch and 
Staff with a technique known as the recognition method, one of several 
approaches to advertisement readership (e.g., see [3], pp. 647-651). Starch 
scores are widely used by advertising agencies and marketing personnel for 
measuring advertisement readership. Three ratings are available: 

noted. The per cent of readers who remembered that they had seen the 
advertisement in the particular issue. 

seen— associated. The per cent of readers who have seen or read any part 
of the ad which clearly indicated the product or advertiser. 
read most. The per cent of readers who read 50 per cent or more of the 
written material in the ad. 

For magazines of general interest the three scores are measured for both 
men and women readers. For magazines like Business Week only men’s scores 
are measured and for magazines like Ladies Home Journal only women’s 
scores are measured. Scores are based on 100 to 200 interviews, depending 
on the publication. 

For each Life readership study interviews are conducted with 150 men 
and 150 women. The interviewers are told that the respondents must be 
18 years of age or older and further that 

"... you are assigned a specific number of interviews (quota) with men and
women and we also want you to interview people of varied ages, different
income levels and occupations. Interview in all types of residential areas.
Do not concentrate on one neighborhood one week and another the next.
Obtain a good cross-section of respondents each week."

The company feels that over a period of many issues the demographic 
characteristics of readers who are interviewed parallel the characteristics of 
the primary audience of this magazine [4]. 

1.3. Reproducibility of Readership Scores 

The selection of Starch interviewees is nonprobabilistic, and for this 
reason the question has been raised whether the method would give repro- 
ducible results. According to Boyd and Westfall ([3], p. 650), the Advertising 
Research Foundation 

"... replicated the methods used by the Starch organization to obtain their
readership ratings. A single issue of Life was selected and a probability
sample of over six hundred readers of the issue was drawn; Starch measured
the same issue using his regular techniques. The correlation of women's
noted scores on the ninety-six full-page and larger ads (measured both by
ARF and Starch) was +0.92."





1.4. Elements of an Advertisement 

It was pointed out earlier that it is only when an advertisement has
successfully attracted the consumer's attention that its content can have any
effect on him. This suggests that it is not the content of the advertisement
that draws the reader but something else. It is proposed here that it is the
format of an advertisement that produces readership. Thus an advertisement
is considered to have two elements: format, that which attracts the reader's
attention, and content, the message contained in the advertisement. More
accurately, as will become clear, the word "content" refers to the advertise-
ment's message; "format" has the meaning "everything that is not content."
The undertaking reported in this article is that of developing a technique for
designing advertisement formats. The selection of advertisement content,
although not totally independent of format design, is not addressed here.

2. READERSHIP MODELS 
2.1. Prediction of Advertisement Readership 

The procedure used to design the format of an advertisement (actually to 
select one format out of many possibilities) will depend on the ability to pre- 
dict readership scores for an advertisement predominantly on the basis of its 
proposed format. Some work has been done in this area. Two examples are
Twedt's article, "A Multiple Factor Analysis of Advertising Research" [5],
and Yamanaka's monograph, "A Method of Prediction of Readership Score
(Newspaper Ads)" [6].

Twedt proceeded approximately as follows: first, 34 advertising variables
[15 "mechanical" and 19 "content" — Twedt's terminology] were defined.
Using 137 advertisements contained in an abbreviated survey issue of The
American Builder (February 1950), the Advertising Research Foundation
conducted a readership study. Each of the 34 variables was correlated in-
dependently with readership. Nineteen variables were selected as being
significantly correlated with readership. A correlation matrix of readership
and these 19 variables was constructed and factor-analyzed. Certain of the
19 variables were then selected on the basis of being "factorially purest, and
which also offered most promise for prediction of advertising readership."
Finally, three variables — size of advertisement (in pages), number of colors,
and square inches of illustration, all format variables — were selected "as pro-
viding maximum prediction with minimum trouble of measurement." When
readership was regressed against these three variables, the resulting correla-
tion coefficient was .76. The inclusion of nine additional variables (presum-
ably those with the next highest factor loadings) produced a correlation co-
efficient of .79. Concluded Twedt, "The gain of .03 is obviously not worth
the time involved in making these additional measurements." After the
analysis of The American Builder, the three-variable regression equation was
applied to one issue each of five other publications: Automotive Industries,
American Machinist, Chemical Engineering, Business Week, and Successful
Farming. The average correlation coefficient for all six issues was .71.





Yamanaka introduced a technique for quantifying the qualitative aspects
of an advertisement. This quantification problem is usually handled by the
use of what are commonly called "dummy" variables [7]. The technique
used by Yamanaka was developed by Hayashi [8]. It is first assumed that
readership can be described by a linear combination of advertisement de-
scriptors. Given linearity, conditions for the maximization of the correlation
coefficient between actual and predicted scores are derived from the point of
view of the quantification of qualitative variables. By use of this technique
on advertisements from two Japanese morning newspapers, the Chubu-Nippon
Shimbun and the Asahi-Shimbun, correlation coefficients of .91 and .89, respec-
tively, were produced. The variables were "space," page of appearance,
position on the page, "layout," evaluation of design, evaluation of headline
(both evaluations were performed by Dentsu Advertising copywriters), and
past advertisement copy linage.
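The "dummy variable" device mentioned above can be shown in a few lines. This is a minimal sketch of the encoding only, not Hayashi's quantification method itself; the state numbering follows the convention of Tables 2 through 12 (one state is later dropped from the regression to avoid singularity).

```python
# A qualitative descriptor with k states becomes k 0/1 indicator
# variables; exactly one indicator is "on" for any given advertisement.

def dummy_encode(state, n_states):
    """Return a 0/1 indicator vector for a 1-based qualitative state."""
    return [1 if i == state else 0 for i in range(1, n_states + 1)]

# e.g. "full color" is state 3 of the three-state number-of-colors variable
print(dummy_encode(3, 3))   # [0, 0, 1]
```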

2.2. Variable Selection 

These two examples (particularly that of Yamanaka) demonstrate that 
reasonably good linear predictive models of advertisement readership can be 
obtained. On this basis the Starch readership score models required for this 
project have been constructed by means of multiple regression analysis. The 
source of data was Life magazine: for 1070 of the 1197 advertisements appear- 
ing in Life between February 7 and July 31, 1964, values for the 12 variables 
listed in Table 1 were measured. All advertisements appearing in this interval 
were used as data except those that promoted contests, special offers, etc., 
multiple products advertisements, and advertisements for which any variable 
value could not be determined. Data for the dependent variable, six Starch 
scores for each ad, were provided by Daniel Starch and Staff. 

Table 1

Variables Used in Regression Analysis

1. Product class
2. Past advertising expenditure
3. Number of ads in issue
4. Number of pages
5. Number of colors
6. Bleed-no bleed
7. Left or right page
8. Position in magazine
9. Layout
10. Number of words
11. Brand prominence
12. Headline prominence


Although the readership models are intended to predict Starch scores 
for an advertisement on the basis of its format, two of the 12 variables are 





related to format only in its broad sense, “everything that is not content.” 
One of the two, product class, has been chosen because of the obvious wide 
variation in Starch scores for products in different product classes. The 
second, past advertising expenditure (the promotional expenditure for the 
product in national media during the calendar year preceding the year of the
advertisement publication), was selected in an attempt to have some measure 
of product awareness. Two additional variables, brand and headline promin- 
ences, may seem to reflect content rather than format, but because they tend 
to indicate the approximate size of the brand name or logo and headline of 
an advertisement they have been included. Definitions of all variables, except 
number of ads in issue, which is simply the number of advertisements in the 
issue competing for the reader's attention, are presented in Tables 2 
through 12. 
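How one advertisement's format becomes a regression input row can be sketched as follows. The state counts follow Tables 2 through 12 (five expenditure states, as Table 14 lists); the concatenated-indicator layout itself is an assumption for illustration, since the article does not spell out its matrix construction.

```python
# Each qualitative variable contributes a block of 0/1 indicators;
# "number of ads in issue" enters as a plain numeric value.

N_STATES = {
    "product_class": 17, "past_expenditure": 5, "pages": 5, "colors": 3,
    "bleed": 2, "left_right": 3, "position": 7, "layout": 19,
    "words": 2, "brand_prominence": 5, "headline_prominence": 5,
}

def encode_ad(states, n_ads_in_issue):
    """One numeric entry followed by concatenated indicator vectors."""
    row = [n_ads_in_issue]
    for name, k in N_STATES.items():
        s = states[name]
        row.extend(1 if i == s else 0 for i in range(1, k + 1))
    return row

# a full-color, one-page tobacco ad on a right-hand page, first quarter
ad = {"product_class": 11, "past_expenditure": 1, "pages": 1, "colors": 3,
      "bleed": 2, "left_right": 3, "position": 1, "layout": 13,
      "words": 1, "brand_prominence": 5, "headline_prominence": 2}
row = encode_ad(ad, 50)
print(len(row))   # 74: one numeric entry and 73 indicators
```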

Table 2
Product Class

                                                          State

Beer, ale, liquor                                           1
Passenger cars                                              2
Automotive accessories, gas, oil, tires, trucks, other      3
Building materials, paint, wallpaper, flooring              4
Food                                                        5
Household furnishings and supplies                          6
Insurance and finance                                       7
Machinery, metals, industrials, business machines,
  public utilities                                          8
Pharmaceuticals                                             9
Radio, television, electronics, audio equipment            10
Tobacco and related products                               11
Men's clothing                                             12
Women's clothing                                           13
Men's toilet goods                                         14
Women's toilet goods                                       15
Clothing accessories, general toilet goods                 16
All others                                                 17

Table 3

Past Advertising Expenditure [10]

Expenditure Range
(x $100,000)            State

0-4                       1
5-10                      2
11-18                     3
19-29                     4

Table 4

Number of Pages†

                              State

One page                        1
Two pages                       2
One-half page (vertical)        3
One-half page (horizontal)      4
All others                      5

† Note that this variable reflects both size and orientation.


Table 5
Number of Colors

                      State

Black and white         1
Two colors              2
Full color              3

Table 6

Bleed-No Bleed†

                                       State

Advertisement does not use bleed         1
Advertisement makes use of bleed         2

† Bleed is the elimination of white page margins.


Table 7
Left or Right Page

                                              State

Does not apply (two-page ad, covers, etc.)      1
Half- or full-page ad on left-hand page         2
Half- or full-page ad on right-hand page        3










Table 8

Position in Magazine

                        State

First quarter             1
Second quarter            2
Third quarter             3
Fourth quarter            4
Inside front cover        5
Inside back cover         6
Outside back cover        7


Table 9
Layout

                                                                     State

One large illustration                                                  1
One small illustration                                                  2
More than one small illustration                                        3
One large photograph                                                    4
One small photograph                                                    5
More than one small photograph                                          6
One large illustration and one small illustration                       7
One large illustration and more than one small illustration             8
One large illustration and one small photo                              9
One large illustration and more than one small photo                   10
One large photo and one small illustration                             11
One large photo and more than one small illustration                   12
One large photo and one small photo                                    13
One large photo and more than one small photo                          14
One small illustration and one small photo                             15
One small illustration and more than one small photo                   16
One small photo and more than one small illustration                   17
More than one small illustration and more than one small photo         18
No photograph or illustration                                          19


Table 10
Number of Words

                            State

Fewer than fifty words        1
Fifty or more words           2




Table 11

Brand Prominence

Brand Name or Sponsor Is        State

Not present                       1
Difficult to detect               2
Easy to miss                      3
Easy to detect                    4
Almost impossible to miss         5


Table 12

Headline Prominence

Headline Is                     State

Not present                       1
Difficult to detect               2
Easy to miss                      3
Easy to detect                    4
Almost impossible to miss         5


2.3. Presentation of and Comments on Final Models

The final Starch readership score models, that is, the regression coefficients,
appear with indications of statistical significance in Tables 13 through 25.
Variable states omitted from the regression as a means of preventing the
matrix singularity that would otherwise be caused by using dummy variables
are indicated by dashes. Because the readership models have been constructed
as a means to an end rather than an end in themselves, discussion of them
is limited to one brief statement about the relationship of each independent
variable to readership.

1. As expected, there are marked differences in product-class readership
between men and women and among different product classes. (Refer to
Table 13.)





Table 13

Product-Class Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1      11.5905‡   12.3707‡     4.7943‡     -19.1961‡   -19.9756‡    -6.7268‡
  2      19.5057‡   21.4411‡     7.6697‡     -21.1663‡   -22.5102‡    -6.5595‡
  3      11.6762‡   12.6520‡     6.5064‡     -24.5510‡   -26.6543‡    -6.7979‡
  4         —          —            —            —           —            —
  5       9.0203‡    6.4002§     2.8277       -1.6464     -6.2085§    -0.9433
  6      -1.2471    -0.9323      0.4598       -0.3434     -0.1651      0.6443
  7       1.9910     0.9861      1.9819‡     -16.5995    -19.0118‡    -4.6676‡
  8       3.9241‡    3.6092‡     2.8454‡     -10.4453‡   -12.8830‡    -2.5963‡
  9       3.0698§    2.2312      1.5013§      -4.7141‡    -5.8439‡    -1.4719
 10       8.3439‡    8.5984‡     3.8808‡      -8.5532‡    -9.5103‡    -2.7763‡
 11       5.7084‡    6.3266‡     0.2650      -18.8901‡   -19.4298‡    -7.1611‡
 12      14.7261‡   14.5064‡     6.2608‡      -9.7710‡   -10.8504‡    -1.3381
 13      -5.6468§  -11.4612‡    -1.1344       10.3582‡     9.3084‡     6.4042‡
 14      12.3581‡   12.9834‡     6.1198‡     -16.4946‡   -17.2523‡    -3.6097‡
 15     -12.8196‡  -17.8393‡    -2.9818‡       0.3378     -1.6648      1.7306§
 16       4.5563‡    4.7286‡     1.9580§      -5.1551‡    -6.3054‡    -2.3381‡
 17       6.7455‡    6.6168‡     4.1819‡      -9.8776‡   -10.8927‡    -1.4170‡

† Refer to Table 2.
‡ Significant at the 1% level.
§ Significant at the 5% level.





2. The past advertising expenditure for a product does not seem to affect
readership to any great extent. It is apparent, however, that the effect is
nonlinear. (Refer to Table 14.)








Table 14

Past-Advertising-Expenditure Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1         —          —            —            —           —            —
  2      -0.9371    -0.5519     -1.3266‡       2.7711‡     2.8644‡      0.6504
  3       0.8117     1.5566     -0.9577§      -0.3117     -0.0667      -1.1339§
  4       1.9080§    2.5815‡    -0.3631        2.1297‡     2.6874‡     -0.3627
  5       2.7438‡    3.3250‡     0.3980        5.6850‡     6.5165‡      0.7578

† Refer to Table 3.
‡ Significant at the 1% level.
§ Significant at the 5% level.








3. For each Starch score the greater the number of ads in the issue the
lower the score. The effect decreases from the "noted" score to "seen-
associated" to "read most," and is greater for men than for women. (Refer
to Table 15.)


Table 15

Number-of-Ads-in-Issue Coefficients

             Men Readers                           Women Readers
  Noted      Seen-       Read Most      Noted      Seen-       Read Most
             Associated                            Associated

-0.1395‡    -0.1233‡    -0.0802‡      -0.1056‡   -0.0959‡    -0.0524‡

‡ Significant at the 1% level.


4. The contribution of advertisement size to readership decreases in
the following order: double-page, "other," single-page, horizontal half-page,
and vertical half-page. The score-to-score and men-to-women relationships
are as in (3). (Refer to Table 16.)


Table 16

Number-of-Pages Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1         —          —            —            —           —            —
  2      16.7655‡   15.1207‡     4.0288‡      12.9481‡    12.7032‡      1.8483
  3      -6.8914‡   -5.8631‡    -1.3969‡      -9.4308‡    -8.6512‡     -2.3281‡
  4      -6.5803‡   -5.8516‡    -0.5948       -9.1647‡    -8.5328‡     -2.3898§
  5       8.4876‡    6.9813‡     2.2357        8.7862‡     8.9503‡      2.6052

† Refer to Table 4.
‡ Significant at the 1% level.
§ Significant at the 5% level.



5. The greater the number of colors in an advertisement, the greater its
readership, except in the case of the "read most" score, on which color has
little effect. Color affects women more than it does men. (Refer to Table 17.)

Table 17

Number-of-Colors Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-        Read Most     Noted       Seen-        Read Most
State†               Associated                             Associated

  1      -7.?876‡  (illegible)  (illegible)  -11.9463‡   -10.3119‡   (illegible)
  2      -?.2627‡  (illegible)  (illegible)   -7.6150‡    -6.4618‡   (illegible)
  3         —          —            —            —            —           —

† Refer to Table 5.
‡ Significant at the 1% level.




6. The elimination of white page margins (bleed) increases readership,
but very little. (Refer to Table 18.)


Table 18

Bleed-No Bleed Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1         —          —            —            —           —            —
  2       1.1661‡    0.4975      0.9522‡       2.3106‡     1.9884‡      1.2123§

† Refer to Table 6.
‡ Significant at the 5% level.
§ Significant at the 1% level.


7. An advertisement of one page or less will receive higher readership 
scores if it appears on a right-hand rather than a left-hand page. (Refer to 
Table 19.) 






Table 19

Left- or Right-Page Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1         —          —            —            —           —            —
  2      -7.1476‡   -6.6993‡    -2.5641       -4.1485     -4.9880§     -0.3173
  3      -3.1159‡   -2.7924‡    -0.2965       -3.5151‡    -3.0433‡     -0.1665

† Refer to Table 7.
‡ Significant at the 1% level.
§ Significant at the 5% level.

8. The contribution to readership of the position of an advertisement 
in an issue descends in the following order: outside back cover, inside front 
cover, inside back cover, first, second, third, and fourth quarters. The 
score-to-score and men-to-women relationships are as in (3). (Refer to 
Table 20.) 

Table 20

Position-in-Magazine Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1         —          —            —            —           —            —
  2      -1.4177‡   -1.2969     -0.5980       -0.7769     -0.2797     (illegible)
  3      -2.6807§   -2.3602§    -0.2671       -2.2854§    -1.6745‡    (illegible)
  4      -2.9507§   -2.2841§    -0.5530       -2.7811§    -2.1355§    (illegible)
  5      11.5497§    9.0402§     0.9295       12.6888§    11.5751§      1.3010
  6       3.7147     3.2280      1.5280        0.5015     -0.1561     (illegible)
  7      30.5751§   29.7921§     3.7246§      21.6501§    21.6207§     -1.3748

† Refer to Table 8.
‡ Significant at the 5% level.
§ Significant at the 1% level.


9. In general, an advertisement will receive higher readership scores if 
it contains a photograph rather than an illustration, either of which is better 
than neither. The effect is greatest in the “noted” score and smallest in the 
“read most” score. (Refer to Table 21.) 







Table 21
Layout Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1      -2.1009     1.1604      0.9702       -3.9270     -4.1479‡     -1.1800
  2      -5.5272§   -4.0761‡     0.6212       -6.9972‡    -5.1664‡     -1.7612
  3      -5.7626§   -4.8930§     0.7962       -6.1208§    -4.6741§      0.4415
  4         —          —            —            —           —            —
  5      -1.5887§    4.1552§     0.2096       -2.8756     -1.9287       2.7534‡
  6      -4.5480§   -4.1097§     1.4616§      -1.6980     -0.6418       0.3764
  7      -2.4658§   -2.7533     -1.8139     (illegible)  (illegible)   -1.3387
  8      -8.3204‡   -8.4159§    -2.6524        1.4361     -0.8015      -0.6603
  9       3.7575     4.3385     -4.3058‡      -9.2201‡    -8.7606§     -4.7040‡
 10      -7.6355‡   -5.0363      4.7727§      -7.3384     -4.9051      -1.2240
 11      -0.6046    -0.9141     -0.8969       -2.1214     -3.9517      -1.4742
 12      -2.8576    -2.4345     -0.7553       -3.3222     -2.6729      -1.9749‡
 13      -2.0430‡   -2.0848§    -2.3419§      -0.4754     -0.1307      -1.4686§
 14      -3.0070§   -3.1840§    -1.5806§      -0.0063      0.6124      -0.3627
 15      -0.7949     0.7744     -1.3216       -2.7569     -1.2586      -0.7010
 16      -7.5820§   -7.0439§    -0.8651       -8.3938§    -7.3309§     -1.9160
 17      -3.1132    -2.0494     -0.6427      -11.4373§   -10.7137§     -2.1212
 18      -3.5014    -4.0371     -2.6726       -0.9048      0.6199      -0.9712
 19     -10.5719§   -7.7140‡     0.1231      -11.5402§    -7.8976‡     -0.7260

† Refer to Table 9.
‡ Significant at the 5% level.
§ Significant at the 1% level.

10. As the number of words of text in an advertisement becomes greater
than 50, readership goes down. This is most applicable to the "read most"
score. (Refer to Table 22.)

Table 22

Number-of-Words Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1       1.9764§    1.7760‡     5.0123§       4.3240§     4.8185§    (illegible)
  2         —          —            —            —           —            —

† Refer to Table 10.
§ Significant at the 5% level.
‡ Significant at the 1% level.






11. The prominence of brand identification in an advertisement appears
to have little effect on readership. (Refer to Table 23.)

Table 23

Brand-Prominence Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1         —          —            —            —           —            —
  2      -0.9569    -1.3820      2.0120        0.7828      0.7944       1.4216
  3      -0.0637     0.8281     -0.9069§      -0.2548      0.5085      -0.8025§
  4      -0.2995    -0.4582      1.0302       -0.8194     -1.1429       0.2711
  5      -0.0156     1.6474     -1.0428§       0.1647      1.8548      -0.4859

† Refer to Table 11.
‡ Significant at the 1% level.
§ Significant at the 5% level.


12. Headline prominence has little effect on readership except in the
"read most" score, in which the total absence of a headline can increase text-
reading significantly. (Refer to Table 24.)

Table 24

Headline-Prominence Coefficients

                    Men Readers                            Women Readers
          Noted      Seen-       Read Most      Noted       Seen-       Read Most
State†               Associated                             Associated

  1      -0.1317    -0.8529      3.4003‡       1.4736      0.0716       4.5826‡
  2       0.6950     0.6491      0.1897        1.1429      1.0075       0.6797
  3       1.1217     1.3221‡    -0.2419        0.1730      0.0027      -0.0664
  4         —          —            —            —           —            —
  5      -0.6936    -0.4270      0.3745       -1.3232     -1.0931       0.0383

† Refer to Table 12.
‡ Significant at the 1% level.
§ Significant at the 5% level.


Table 25

Regression Equation Constant Terms

             Men Readers                           Women Readers
  Noted      Seen-       Read Most      Noted      Seen-       Read Most
             Associated                            Associated

 42.0259    35.5477      9.6826       57.9110    51.8479     12.7425



Figure 1. Validation of readership models (plots of predicted versus measured scores).












2.4. Model Validation

In order to test the predictive ability of the readership models, they have
been used to predict scores for 43 advertisements appearing in the February
26, 1965, issue of Life. The set of advertisements used for validation appeared
nearly eight months after the most recent advertisements used as input to the
regression; that is, they are totally independent of the original 1070 advertise-
ments. Because the test used here is probably the most stringent to which a
predictive model can be put, the performance of the six models on the test
advertisements should be a good indication of their predictive power.
Figure 1 presents plots of predicted versus measured scores for the set of test
advertisements. The coefficients of multiple correlation and multiple deter-
mination corresponding to the plots in Figure 1 are presented in Table 26.


Table 26

Validation of Readership Models — Coefficients of Multiple
Correlation (R) and Multiple Determination (R²)

             Men Readers                           Women Readers
  Noted      Seen-       Read Most      Noted      Seen-       Read Most
             Associated                            Associated

R    0.8240    0.8297      0.5643       0.8581    0.8588      0.7507
R²   0.6790    0.6884      0.3185       0.7363    0.7376      0.5636


In brief, it seems fair to say that four of the models, all except the two for
the "read most" score, are relatively good. In each of these four models
68 per cent or more of the variance in the readership score has been
accounted for. The two "read most" models, on the other hand, are not very
good. This is probably explained by considering the meaning of the "read
most" score. A person reading an advertisement will read more than half its
text only if the first half is of sufficient interest to him. What is obviously
lacking is some sort of content analysis.
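The validation computation behind this comparison can be sketched directly: the coefficient of multiple correlation R between predicted and measured scores for a hold-out set, and R², the fraction of variance accounted for. The score values below are invented for illustration; they are not the 43 test advertisements from the February 26, 1965, issue.

```python
# Pearson correlation between predicted and measured readership scores;
# its square is the "variance accounted for" discussed in the text.

def correlation(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

predicted = [44.1, 31.0, 52.3, 28.7, 39.5, 47.2]   # hypothetical model output
measured  = [46.0, 29.5, 50.1, 31.2, 37.8, 49.0]   # hypothetical Starch scores

r = correlation(predicted, measured)
print(round(r, 4), round(r * r, 4))
```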


3. THE READERSHIP MODEL AND ADVERTISEMENT 
FORMAT DESIGN 


3.1. Form of the Model

An attempt has been made to construct a set of linear models such that
one Starch score for a single advertisement appearing in Life may be predicted
by the expression

    R_n = C + Σ_i c_i x_i + Σ_h d_h y_h + e
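Evaluating such a linear model amounts to summing a constant, a slope times the number of ads in the issue, and the coefficient of whichever state each qualitative variable is in (omitted states contribute zero). Only a few coefficients are reproduced below, drawn from Tables 13, 15, 16, and 25 for the men's "noted" score where legible; treat the numbers and the dictionary layout as illustrative, not as the article's implementation.

```python
# A sketch of evaluating one readership model for a proposed format.

MEN_NOTED = {
    "constant": 42.0259,             # Table 25
    "n_ads_slope": -0.1395,          # Table 15
    "product_class": {11: 5.7084},   # tobacco and related products (Table 13)
    "pages": {2: 16.7655},           # two pages (Table 16)
    "position": {1: 0.0},            # first quarter, the omitted state
}

def predict(model, states, n_ads_in_issue):
    """Predicted Starch score for an ad described by one state per variable."""
    score = model["constant"] + model["n_ads_slope"] * n_ads_in_issue
    for var, coeffs in model.items():
        if var in ("constant", "n_ads_slope"):
            continue
        score += coeffs.get(states[var], 0.0)   # omitted/unknown state -> 0
    return score

s = predict(MEN_NOTED, {"product_class": 11, "pages": 2, "position": 1}, 50)
print(round(s, 4))   # 57.5248
```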








considerations lead to the concept of a readership objective function that can
have either of the following forms:

    (I)   V = Σ (w_i R_i),          summed over i = 1, ..., 6;

    (II)  V = Σ (w_i R_i) / C,      summed over i = 1, ..., 6;

where w_i is a weight (or importance) assigned to the ith Starch score R_i and
C is the cost of inserting the advertisement. The w_i are arbitrarily required
to sum to one. The second form is used when readers-per-dollar is to be
maximized. As an example, suppose an advertiser finds himself with a food
product whose promotion he feels (or his research has shown) should be
aimed at women and men in a ratio of, say, 7 to 3. Also, for the particular
advertisement in question he wants as many women as possible to read the
text of the ad, which contains several recipes that present new ways to
prepare the product. As far as men readers are concerned, the advertiser
will be satisfied with producing brand-name remembrance. In a mathematical
programming sense, then, the readership objective function for the advertise-
ment would be 0.3 R_2 + 0.7 R_6, where R_2 is the "seen-associated" score for
men readers and R_6 is the "read most" score for women readers. If the
advertiser wished to maximize readership per dollar, the objective function
would become (0.3 R_2 + 0.7 R_6)/C.
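The two objective-function forms can be written out in a few lines. The ordering of the six scores (men noted, seen-associated, read most, then the same for women) is an assumption consistent with the R_2/R_6 usage in the text.

```python
# Form (I) when cost is None; form (II), readers per dollar, otherwise.

def objective(weights, scores, cost=None):
    assert abs(sum(weights) - 1.0) < 1e-9, "the w_i are required to sum to one"
    v = sum(w * r for w, r in zip(weights, scores))
    return v if cost is None else v / cost

# the food-product example: 0.3*R2 + 0.7*R6
w = [0.0, 0.3, 0.0, 0.0, 0.0, 0.7]
scores = [40.0, 35.0, 10.0, 55.0, 50.0, 14.0]   # hypothetical Starch scores
print(round(objective(w, scores), 4))            # 20.3
print(objective(w, scores, cost=50000.0))        # form (II), per dollar
```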


3.4. Mathematical Programming Problem 

The problem of selecting an advertisement format, which, although 
subject to restrictions, is to maximize a readership objective function, may be 
viewed as a mathematical programming problem. There are, however, several 
differences between the typical mathematical programming problem and the 
problem encountered here. One is due to the incompatibility of certain vari- 
able values; for example, a cover cannot be a half-page ad. Furthermore, 
costs follow no simple functional form, so the cost of a format must be
looked up in a table [9]. 
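A selection problem of this small size can be handled by the simplest possible method: enumerate the state combinations of the controllable variables, discard incompatible ones (a cover cannot carry a half-page ad), look the cost up in a table, and keep the feasible maximizer. The state spaces, cost table, and scoring function below are toy stand-ins, not the article's actual data.

```python
from itertools import product

PAGES = [1, 2, 3, 4]           # one page, two pages, half (vert), half (horiz)
POSITIONS = [1, 2, 3, 4, 7]    # four quarters and the outside back cover

COST = {(p, q): 10000 * p + (40000 if q == 7 else 0)   # toy cost table
        for p in PAGES for q in POSITIONS}

def compatible(pages, position):
    """A cover position cannot be combined with a half-page size."""
    return not (position == 7 and pages in (3, 4))

def score(pages, position):
    """Toy readership objective; stands in for the regression models."""
    return {1: 5, 2: 10, 3: 2, 4: 3}[pages] + {1: 4, 2: 3, 3: 2, 4: 1, 7: 12}[position]

def best_format(budget):
    feasible = [(p, q) for p, q in product(PAGES, POSITIONS)
                if compatible(p, q) and COST[(p, q)] <= budget]
    return max(feasible, key=lambda f: score(*f))

print(best_format(60000))   # two pages on the outside back cover
```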


4. THE C-A-P-M-A-F SYSTEM 


4.1. Description 

The preparation of magazine advertisement formats has been implemented 
by a set of conversational computer programs given the name C-A-P-M-A-F 
(computer aided preparation of magazine advertisement formats), operating 
under the M.I.T. compatible time-sharing system. The C-A-P-M-A-F 
system consists of a main program, seven subroutines (all in FORTRAN), 
and a file containing the readership models and other necessary data. Because 
the system operates conversationally, all the user needs to know about the 
time-sharing system is the procedure for requesting program loading and 



execution. Once execution has begun, C-A-P-M-A-F demands all the in-
formation it needs on the remote console typewriter. After each entry by the
user from the console keyboard the system checks for errors. If an entry is
outside an allowable range, is illegally omitted, or the user does not follow
instructions, the message is repeated.
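The input discipline just described can be sketched as a loop that repeats the request until a legal value arrives. Prompt wording and the injectable `read` parameter are inventions for illustration; the real system ran conversationally under the M.I.T. time-sharing system.

```python
def ask_state(prompt, n_states, read=input):
    """Repeat the request until an integer in 1..n_states is supplied."""
    while True:
        raw = read(prompt + " ")
        try:
            value = int(raw.strip())
        except ValueError:
            continue            # illegal entry: repeat the message
        if 1 <= value <= n_states:
            return value

# e.g. ask_state("NUMBER OF COLORS (1-3)?", 3)
```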

C-A-P-M-A-F has two operation modes. The first is used to test the 
readership models by predicting Starch scores for an advertisement that has 
already been published and comparing the predicted and measured scores. 
Operation mode two, the design mode, requests a readership objective func- 
tion and any necessary format or budget restrictions. The system then selects 
and displays the advertisement format that, although conforming to all re- 
strictions, maximizes the objective function. The user may ask to see as many 
as four “runner-up” formats and may then choose to modify the objective 
function, format restrictions, or budget constraint and have C-A-P-M-A-F 
select the new optimal format. 

4,2. Example 

As an example of the use of C-A-P-M-A-F in preparing a magazine 
advertisement format, consider the hypothetical case of the Universal Tobacco 
Company, which is about to introduce a new brand: Puff Filter Cigarettes. 
To introduce Puff nationally (no promotion has yet appeared), it has been 
decided to place an advertisement in Life. The format of this ad will be 
prepared with the assistance of C-A-P-M-A-F. 

We suppose that research in the cigarette market has indicated to 
Universal that their efforts in reaching men and women should be divided 
approximately in the ratio of 55 to 45. The main function of the Life adver- 
tisement is to introduce the name “Puff.” The special characteristics of 
Puff’s “best filter yet” are to be secondary in this advertisement. It has been 
decided, therefore, to split the 55 per cent for men into weightings of 40 and 
15 for the Starch “seen-associated” and “read most” scores, respectively, 
in the readership objective function. For women the weightings are set at 
35 and 10 for the same scores. It is intended to examine those formats pro- 
duced with both forms of the readership objective function. Universal has 
no control over three of the 12 readership model variables: Puff Filter Cigar- 
ettes are in the “tobacco and related products” product class, there has been 
no promotion, and about 50 advertisements are expected for the Life issue in 
question. For the remaining nine variables, in order to give C-A-P-M-A-F 
as much freedom as possible in maximizing the readership objective function, 
only those restrictions that have been decided are absolutely necessary will 
be imposed: the advertisement will contain two full-color photographs, 
one large and one small, the larger showing a pleasant scene of a young 
couple enjoying their first Puffs and the smaller showing the construction of 
the filter. The brand prominence is set at its maximum level. 

The readership objective function proposed earlier is used first in its 
second form (readers per dollar). The format restrictions are then entered. 
No budget constraint is placed on the ad. The first modification is to change 





DANIEL S. DIAMOND 


the readership objective function to its first form. Finally, the budget limita- 
tion is set at $50,000. The format selected by C-A-P-M-A-F under the final 
set of conditions is the following: 


FINAL C-A-P-M-A-F SELECTION 


SUBJECT TO THE CONSTRAINTS ENTERED EARLIER, C-A-P-M-A-F 
HAS CHOSEN THE ADVERTISEMENT FORMAT DISPLAYED BELOW AS 
MAXIMIZING THE DEFINED READERSHIP OBJECTIVE FUNCTION. 


PRODUCT CLASS                   TOBACCO + RELATED PRODS
PAST ADVERTISING EXPENDITURE    UP TO $449K
NUMBER OF ADS IN ISSUE          50
NUMBER OF PAGES                 ONE-HALF PAGE (HORIZ)*
NUMBER OF COLORS                FULL COLOR
BLEED—NO BLEED                  BLEED*
LEFT OR RIGHT PAGE              RIGHT*
POSITION IN MAGAZINE            FIRST QUARTER*
LAYOUT                          1 LG PHOTO + 1 SM PHOTO
NUMBER OF WORDS                 FEWER THAN 50*
BRAND PROMINENCE                VERY HIGH
HEADLINE PROMINENCE             LOW*

* SELECTED BY C-A-P-M-A-F 


5. CONCLUSION 

Is the approach suggested in this article a useful method of selecting a 
magazine advertising format or is it merely of academic interest? It has been 
shown that the degree to which an advertisement will be read can be pre- 
dicted fairly accurately on the basis of its format. The practicality of this 
approach depends on the attitude of the advertiser. If he is willing to accept 
advertisement readership as a measure of attention-getting power and he 
considers attention-getting power important to advertisement effectiveness, 
he must consider the technique useful. 



COMPUTER-AIDED PREPARATION OF MAGAZINE ADVERTISEMENTS 






COMPUTER-AIDED PREPARATION OF MAGAZINE 
ADVERTISEMENT FORMATS 

RÉSUMÉ 

Whereas most attempts to measure advertising bear on media selection and 
the scheduling of insertions, the study in question is based directly on the 
graphic aspect of the magazine advertisement. The author first argues that, 
to influence the consumer, an advertisement must attract his attention, and 
proposes to take the number of its readers as the measure of an advertise- 
ment’s attention value. Next, a set of six Starch readership models for 
magazine advertisements is established by multiple regression analysis, 
starting from 1070 advertisements that appeared in Life. The independent 
variables include the size of the advertisement, its colors, its position in the 
magazine, and so on; there are twelve in all. Finally, a conversational 
computer program is developed that requires of its user a definition of the 
target audience, weightings of the six Starch scores for the advertisement to 
be prepared, its cost, the format restrictions, and the budget constraints. 
The program then determines the advertisement format that, while meeting 
all the required conditions, best attains the objectives previously set. 





A HEURISTIC APPROACH TO SOME SALES 
TERRITORY PROBLEMS 

Une Approche Heuristique de Quelques Problèmes 
Relatifs aux Territoires de Vente 

James B. Cloonan 

Loyola University, Chicago, Illinois 
United States of America 


1. INTRODUCTION 

1.1. General 

This article is concerned with several aspects of industrial sales territories 
and the activities of industrial salesmen. Specifically, it deals with the questions 
of “Who to visit this trip?” and “What route to take?” There are, of course, 
many levels of sales territory problems. Starting at the broadest, they include 

— How many salesmen or territories? 

— Where should the boundaries of these territories be? 

— Which accounts to visit on each trip (or how many times a period to call 
on each)? 

— What route to take? 

It is extremely common to find the decision process starting with the broadest 
problem, number of salesmen, and working toward the more specific. The 
approach of this research is in the opposite direction. A decision about the 
number of territories and their boundaries seems dependent on determining 
how much a salesman can accomplish when operating at maximum efficiency 
in a given area. The techniques developed are oriented toward decisions about 
operations in a single territory — in fact, for subdivisions of a single territory. It 
seems likely that output from such a model is essential as input for a model of 
the company as a whole. This article is concerned, then, with the last two of 
the stated problems. 

Examining the two questions of interest — “Who to visit this trip?” and 
“What route to take?” — we find substantial work done regarding each area 
but next to nothing regarding the solution when both elements are present. 

1.2. The Selection Problem 

The first question, which is referred to as the “selection problem,” is 
simply another view of call frequency and has been referred to commonly as the 
“distribution of effort problem.” A marginal analysis approach can be taken 
to its solution. Set 

MVC_1/T_1 = MVC_2/T_2 = ... = MVC_N/T_N, 



where MVC_i, i = 1, ..., N, are the marginal values of sales calls and T_i is the cost 
(in time) of such calls. By setting 

Σ_i X_i T_i = T_0, 

where X_i is the number of calls on account i and T_0 is the total time avail- 
able, the optimal distribution of effort is achieved. This is not necessarily the 
profit maximization position of the firm, for T_0 may not be at the optimal 
level, but it is the best solution within the resource restraint. Koopman [6] 
and Miehle [8] discuss solutions to more complex forms of this problem. 
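The equal-ratio condition above can be approximated by a simple greedy rule: repeatedly give the next call to the account with the highest marginal value per unit of calling time until the budget T_0 is exhausted. The sketch below (illustrative numbers, not from the article) shows this with diminishing marginal values:

```python
# Greedy sketch of the marginal-analysis distribution of effort.
# Allocating each call to the account with the highest MVC_i/T_i drives
# the ratios toward equality, the optimality condition in the text.

def allocate_calls(mvc, T, T0):
    """mvc[i](k) = marginal value of the (k+1)st call on account i;
    T[i] = time cost of a call on account i; T0 = total time available."""
    calls = [0] * len(T)
    time_used = 0.0
    while True:
        # account with the best marginal value per unit of calling time
        best = max(range(len(T)), key=lambda i: mvc[i](calls[i]) / T[i])
        # stop when the best remaining call no longer fits (a simplification)
        if mvc[best](calls[best]) <= 0 or time_used + T[best] > T0:
            break
        calls[best] += 1
        time_used += T[best]
    return calls

# Two accounts with geometrically diminishing returns (illustrative).
mvc = [lambda k: 100 * 0.5 ** k, lambda k: 60 * 0.5 ** k]
T = [2.0, 1.0]          # hours per call
print(allocate_calls(mvc, T, T0=8.0))
```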

The difficulty with this approach is that in ordinary territories T_i is inter- 
related and variable. The cost of calls on accounts A, B, and C depends on the 
route taken between them and also on the required starting and finishing points. 
The cost of a call on any account depends on other calls made on the same trip. 
If the variation in call costs were a smoothly changing function, marginal costs 
could be substituted for the T_i and the approach described could still be used. 
The pattern is irregular, however, and a discrete incremental or opportunity 
cost approach must be employed. 

It should be stated that the marginal analysis solution is valid in those 
circumstances in which T_i is constant. These conditions are met when the 
salesman makes each call from his office and returns before making the next 
call. 


1.3. The Routing Problem 

The second problem, often called the "traveling salesman problem,” seeks 
the shortest (least cost) route through a series of points. Although the optimum 
path can be determined by evaluating all paths, this approach is unsuited to 
problems with more than a few points because of the large number of combi- 
nations. Linear programming, integer programming, and heuristic approaches 
have been used to choose the optimum or near optimum path. There appears 
to be a necessary "trade-off” between solution time and closeness to optimum. 
Good or optimal solutions can be obtained in a number of ways [1, 3, 4, 5, 7, 9]. 

Unfortunately, the traveling salesman problem does not, despite its name, 
relate to the real problems of most traveling salesmen. Because of the different 
value of each account, a salesman will not call on all of them each trip. This 
means that a solution to the problem is necessary for every subset of accounts. 
Computer time requirements for current solution methods make this approach 
infeasible with any quantity of accounts. 
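The trade-off between solution time and closeness to optimum can be seen in miniature below: exact enumeration examines (n − 1)! orderings, while a simple nearest-neighbor pass is fast but only near-optimal. This is an illustrative contrast, not taken from the article's references:

```python
import itertools

# Exact enumeration versus a nearest-neighbor heuristic for a tiny
# traveling-salesman instance (closed tour starting at city 0).

def tour_cost(dist, order):
    legs = zip(order, order[1:] + order[:1])   # close the tour
    return sum(dist[a][b] for a, b in legs)

def exact_tsp(dist):
    n = len(dist)
    best = min(itertools.permutations(range(1, n)),
               key=lambda rest: tour_cost(dist, [0] + list(rest)))
    return [0] + list(best)

def nearest_neighbor(dist):
    n = len(dist)
    order, unvisited = [0], set(range(1, n))
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist[order[-1]][j])
        order.append(nxt)
        unvisited.remove(nxt)
    return order

dist = [[0, 1, 5, 10],
        [1, 0, 1, 5],
        [5, 1, 0, 1],
        [10, 5, 1, 0]]
print(tour_cost(dist, exact_tsp(dist)),
      tour_cost(dist, nearest_neighbor(dist)))
```

On this instance the greedy pass is drawn along the cheap chain 0-1-2-3 and pays for the long leg home, costing 13 against the optimum of 12.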

1.4. Model Requirements 

In review, the state of the art provides a marginal analysis approach to 
optimizing sales effort that is not practical in those cases in which the calling 
costs fluctuate, and various approaches to the minimization of tour costs that do 
not apply directly in those cases in which the frequency of sales calls varies. 

The limitations of present solutions to selection and routing problems are 





based on the failure to consider them both at the same time. What is desired 
is to devise a “tour” (accounts to be called on and the route to take) that will, 
in the available time, give greater value than any other tour. 


2. THE MODEL 


Only industrial sales territories of a size in which travel costs neither dominate 
nor are dominated are studied. All accounts (present and potential customers) 
are known. The short-run solution based on available effort is sought, although 
solutions in different territories would indicate the need to expand or contract 
the sales force, thus pointing toward long-run optimization. The unit under 
consideration is the sales tour. A sales tour is defined as a set of customers a 
salesman does, or could, call on in a single trip from his base location and 
back. Although such a tour could take in an entire sales territory, it usually 
represents the fraction of a territory that can be covered in five to ten days, thus 
giving the salesman every weekend or every other weekend at home. Because 
a territory is the sum of these sales tours, the model of a territory or the entire 
market can be based on them. The model, of course, must ultimately consider 
the balancing of tours and territories as well as the assignment of accounts to 
each tour and territory. 

Following are the terms contained in this model. The heuristic is described 
in Figures 1 and 2. In all matrices and vectors accounts are numbered in the 
order of their appearance in a “traveling-salesman” solution for the entire set 
of accounts. Because FORTRAN was used for the programming, some of the 
following symbols have alternate prefixes to denote their integer or real natures. 


N            Number of accounts in tour. Origin is account 1 and N.
TT (N x N)   Matrix of distance between accounts expressed in time.
CT (N x 1)   Duration of a sales call on each account.
JE (N x 1)   Time (months) since last call on each account; 12 >= JE >= 1.
G (N x 12)   Value of a call on any account as a function of JE.
GMAX         The maximum value of G (in this model G is max when JE = 12).
VO           A critical value ratio established by management.
TN           The accumulated value of sales calls.
TR           The accumulated travel time.
TC           The accumulated calling time.
X            Present location of salesman.
Y            Account being evaluated for the next call.
Z            The account after Y, used to establish opportunity cost.
VN           The value f(G) of a call on account Y.
OT           Opportunity (incremental) cost of going to Y on the way to Z.
VR           The ratio VN/(OT + CT).
W            Equated with Y and used in the subroutine in place of Z.
V            The account to be evaluated in the subroutine.
SVN          The value f(G) of a call on account V.






Figure 2. Flow diagram (subroutine). 


The various elements of the model are discussed later in the article. The 
main approach, as shown in the flow diagram, is to evaluate the cost of a call on 
any account in terms of opportunity cost, which is measured as the difference 
between the calling and travel time involved in going from the present location 
(X) to the account under consideration ( Y ) and then to the next account (Z), 
as opposed to the travel cost of proceeding immediately from X to Z. Accounts 
are numbered in their order of appearance in a solution to the traveling-salesman 
problem for the entire set of accounts. The assumption made in establishing the 















route is that when an account is dropped from the traveling-salesman solution the 
remaining path is a good one. This assumption, although not always true, 
eliminates the need for constant recalculation of the traveling-salesman solution. 
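The basic test the heuristic applies at each step can be sketched as follows. This is a minimal illustration of the decision rule described above, assuming the definitions in the term list (it is not the article's FORTRAN program, and the travel weight of 1.4 anticipates the cost figures of Section 3.2):

```python
# Is account Y worth a detour on the way from X to Z?  OT is the extra
# travel of X -> Y -> Z over going straight from X to Z, weighted for
# automobile expense; VR compares the call's value with its incremental
# time cost, against management's go/no-go ratio VO.

def call_worthwhile(TT, CT, value_Y, X, Y, Z, VO, travel_weight=1.4):
    OT = travel_weight * (TT[X][Y] + TT[Y][Z] - TT[X][Z])
    VR = value_Y / (OT + CT[Y])     # value per unit of incremental time
    return VR >= VO

TT = [[0.0, 1.0, 2.0],
      [1.0, 0.0, 1.5],
      [2.0, 1.5, 0.0]]              # hours between accounts (hypothetical)
CT = [0.0, 1.0, 1.0]                # call durations in hours (hypothetical)
print(call_worthwhile(TT, CT, value_Y=30.0, X=0, Y=1, Z=2, VO=10.0))
```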

The geographical distribution of accounts is not random but is related to a 
number of factors such as transportation, access, and geography. In an exami- 
nation of the routes between towns and various potential sales territories it 
appears that the assumption made will hold in the majority of cases unless a 
great number of accounts are dropped. This is a distinct possibility in those 
territories that support a large number of smaller accounts, and thus the heuristic 
program may give a less than optimal solution. For this particular situation a 
better solution could probably be derived by recalculating the traveling-salesman 
solution with the account being evaluated (Y) omitted. Subtract the tour costs 
of the subset from that of the complete tour and use the result as the opportunity 
cost. If the account was not to be called on, the restructured solution would be 
used. The cost of the calculations makes this approach impractical as an alternative. 

3. PROGRAM ELEMENTS 

3.1. G 

The “value” of a call on an account could be described as the present value 
of the stream of future profits accruing from the call. More rigorously, it could 
be the present value of the possible streams multiplied by their appropriate 
probabilities. Practically, this stream is difficult or impossible to calculate. 
In addition, the account itself is not related to many of the determinants of 
profits or even of gross sales. It is more realistic to rate accounts on a relative 
basis, independent of company profitability or general sales policy. It is much 
easier to estimate that account A is worth twice as much as account B than to 
predict the present value of future profit possibilities as $100,000 and $50,000. 
In practical application the relative value of accounts for sales-call purposes is 
based on such factors as present sales volumes, profit and mix, potential future 
volume and its profitability, service, public relations, promotional tie-ins, and 
historic relations between companies. 

Management determination of account categories should go beyond a nominal 
or ordinal scale into a ratio scale that indicates relative importance. Although 
numeric rankings of various companies are taken as a given input in this model, 
it must be emphasized that the actual assignment of such values is quite sig- 
nificant. In any company, however, they are given to every account, either 
consciously or subconsciously, and are reflected in the treatment of each. When 
the assignment of values is not overt, communication becomes another source 
of error. If the sales manager does not expressly state that account A is worth 
three times as much as account B, his general indications of preference may be 
interpreted by the salesman as a 2:1 or a 10:1 ratio. Brown, Hulswit, and 
Kettelle [2] and Waid, Clark, and Ackoff [10] have discussed some of the 
problems of account evaluation. 

The eventual rating given a particular company becomes its maximum value 
of G. The actual value of G at any time is perceived as a function of GMAX 






Figure 3. Relationship of value to elapsed time (value of a call plotted 
against the time since the last call, JE). 


and elapsed time (JE). Although the shape 
of the function relating the value of a call 
and the time since the last call is not precisely 
known and, in fact, would be expected to 
vary considerably, depending on the indus- 
try, a function of the form in Figure 3 is 
visualized. 

The dotted segment before a indicates 
the possibility that too frequent calling may 
annoy the customer and give zero or nega- 
tive value. (If present sales were the only 
determinant of value, negative values would 
be impossible.) The dotted portion after b 
indicates that after some period of time the 


customer will have forgotten the salesman sufficiently so that nonproductive 
effort will be necessary just to renew the formal relationship. In this research 


only the area from slightly above a to b is considered relevant, and the parabolic 
function 

G = (GMAX/10)(-JE^2/18 + 4JE/3) 



is used as an estimate. The actual function would be more complex, but the 
exact shape is not important to the model, and any functional relationship 
established by empirical data for a particular industry could be substituted in 
the program. 
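As reconstructed from the garbled equation above (the coefficients are therefore tentative), the parabola can be evaluated directly; its vertex falls at JE = 12 months, matching the model's statement that G is largest when JE = 12:

```python
# Tentative reconstruction of the parabolic value function:
# G(JE) = (GMAX/10) * (-JE**2/18 + 4*JE/3), for 1 <= JE <= 12.

def G(JE, GMAX):
    return (GMAX / 10.0) * (-JE**2 / 18.0 + 4.0 * JE / 3.0)

# Value rises with elapsed time and peaks at JE = 12
# (dG/dJE = (GMAX/10)(-JE/9 + 4/3) = 0 at JE = 12).
for je in (1, 6, 12):
    print(je, round(G(je, GMAX=100.0), 2))
```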


3.2. Costs 

All costs are calculated in time units; however, calling time costs (CT) 
and travel costs (TT) have been separated and are given different weights in the 
preliminary calculations of value. Variable traveling costs (fixed costs are not 
considered in the program) could be divided into two separate categories, the 
cost of the salesman’s time when traveling and automobile expense. There is a 
high correlation between them and in normal circumstances no necessity is 
seen for separating them, but if there are a great number of alternate routes on 
high-speed highways or tollways in any territory, it may be desirable to do so. 
In this program automobile costs are fixed at 0.4 of the cost of the salesman’s 
time, and so TT is weighted 1.4 of the cost in time. Once again this ratio could 
be adjusted to any particular industry or territory on the basis of empirical 
evidence. The figures in this case were based on a salesman’s hourly cost of 
$10.00, a travel cost of 8 cents a mile, and an average road speed of 50 mph. 
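The arithmetic behind the 1.4 weighting follows directly from those figures:

```python
# Deriving the 1.4 travel-time weight from the quoted cost figures:
# automobile expense per hour of driving is a fixed fraction of the
# salesman's hourly cost, and travel time is weighted 1 + that fraction.

salesman_cost_per_hour = 10.00      # dollars per hour
cost_per_mile = 0.08                # dollars per mile
avg_speed_mph = 50.0                # miles per hour

auto_cost_per_hour = cost_per_mile * avg_speed_mph   # $4.00 per hour
ratio = auto_cost_per_hour / salesman_cost_per_hour  # 0.4
travel_weight = 1.0 + ratio                          # 1.4

print(auto_cost_per_hour, ratio, travel_weight)
```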

3.3. VO 

VO is the “ go-no go” criterion level against which VR is measured. In 
the long run, hopefully, VO will represent the ratio of marginal productivity to 
cost for all other promotional activities as well as for the manufacturing activities 
of the firm. In the short run, however, VO is set so that the salesman will just 
use his available time. 




In practical application there is the possibility that at some level of VO no 
calls at all will be made, and when the level is set one increment higher it will 
specify calls that will take longer than the available time. This situation has not 
developed in any sample territories to date. 

3.4. Subroutine(s) 

In evaluating the opportunity cost of a call on any account, the alternative 
route used as a basis for this decision is the next account in sequence. If, later 
in the program, this next account is dropped from the tour, a re-evaluation of 
previous decisions must be made. When a positive decision to call on an account 
has been made, following a number of dropped accounts, the subroutine moves 
backward to re-evaluate each of the dropped accounts in terms of the known 
future destinations. To consider all possibilities a series of identical nested 
subroutines is necessary. Although as many as (N - 2)/2 nested subroutines 
could be used, two seems to be adequate in most cases. 

It is interesting to note that the problem the subroutine seeks to solve is 
also a problem in the alternative program that would attempt to determine 
opportunity costs by constant recalculation of the traveling-salesman solution 
which omits the accounts being evaluated. In addition to the large amount 
of computer time needed for this approach, it is also true that subsequent 
elimination of accounts would require a subroutine that would return and 
recalculate previous decisions based on the elimination of accounts “down the 
road.” 


4. FINAL CONSIDERATIONS 
4.1. The Rest of the System 

Although the problem being discussed may stand as a distinct segment for 
purposes of analysis, it is a part of a more general company model. Segmen- 
tation is a natural result of problems too complex to be considered simultaneously, 
and singular examination of isolated parts of the total company model is generally 
a necessary step. As pointed out at the beginning of this article, however, there 
are other levels in the problems of sales coverage of national markets. Once 
a company has established optimum operations in the micro-sections of the 
market that have been considered here, it is in a position to build on these 
basic units: territories, districts, regions, and total market. These further 
steps, like the first, involve analytical problems that must be solved, but it seems 
better to build the pyramidal market structure from the base of “sales tours” 
than from the top down. 


5. SUMMARY 

This article has described a heuristic program to solve the problems of account 
selection and routing simultaneously. It is the contention that the approach to 
a solution of company territory problems properly begins by considering each 
salesman and then building up into districts, regions, and the total market. The 




opportunity cost approach presented is currently being evaluated by computer 
simulation of sales territories. The necessity for constant revision caused by 
changes in territories and the great number of territories has restricted the 
approaches. More elegant (theoretically) programs that require hours of com- 
puter time for the solution of each problem have been passed over as impractical. 


6. REFERENCES 

[1] L. L. Barachet, “Graphic Solution of the Traveling-Salesman Problem,” Oper- 
ations Res., 5, 841 (1957). 

[2] Arthur Brown, Frank Hulswit, and John Kettelle, “A Study of Sales Oper- 
ations,” Operations Res., 4, 296 (1956). 

[3] G. Dantzig, D. Fulkerson, and S. Johnson, “Solution of a Large-Scale Traveling- 
Salesman Problem,” Operations Res., 2, 393 (1954). 

[4] G. Dantzig, D. Fulkerson, and S. Johnson, “On a Linear-Programming, Com- 
binatorial Approach to the Traveling-Salesman Problem,” Operations Res., 7, 
58 (1959). 

[5] Robert Karg and Gerald Thompson, “A Heuristic Approach to Solving Travel- 
ing-Salesman Problems,” Management Science, 10, 225 (1964). 

[6] Bernard Koopman, “The Optimum Distribution of Effort,” Operations Res., 
1, 52 (1953). 

[7] John Little, Katta Murty, Dura Sweeney, and Caroline Karel, “An Al- 
gorithm for the Traveling-Salesman Problem,” Operations Res., 11, 972 (1963). 

[8] William Miehle, “Numerical Solution of the Problem of Optimum Distribution 
of Effort,” Operations Res., 2, 433 (1954). 

[9] C. E. Miller, A. W. Tucker, and R. A. Zemlin, “Integer Programming For- 
mulation of Traveling-Salesman Problems,” J. Assoc. Computing Machinery, 7, 
326 (1960). 

[10] Clark Waid, Donald Clark, and Russell Ackoff, “Allocation of Sales Effort in 
the Lamp Division of the General Electric Company,” Operations Res., 4, 629 
(1956). 

A HEURISTIC APPROACH TO SOME SALES 
TERRITORY PROBLEMS 

RÉSUMÉ 

Several important aspects of sales territory problems have been treated by 
operational research methods; however, an integrated method that takes the 
real constraints into account has not yet been established. The traveling- 
salesman problem, although it addresses routing questions, does not take into 
consideration the determining factors that govern the salesman’s calls on 
customers. Similarly, applying the optimum distribution-of-effort method to 
the salesman’s problems takes no account of routing questions. The optimum 
conditions for a marginal analysis can be established, but they are meaningless 
unless they rest on the costs that result from an optimal organization of the 
calls on customers. 





This study describes a heuristic program for organizing the salesman’s 
calls on customers, based on routing considerations as well as on the “value” 
and cost of these calls. The program determines the selection of the next call 
as a function of the present geographic location, the matrix of distances 
between customers, the classical traveling-salesman solution for the entire set 
of customers, the time elapsed since the last call on each customer, the 
“value” of a call on each customer, a function relating this value to the length 
of the period elapsed between calls, the travel costs, and a critical ratio of 
incremental value to incremental cost. 


ISSUING AND PRICING POLICIES 
FOR SEMIPERISHABLES 

Politiques pour la Distribution et la Détermination 
du Prix des Denrées Semi-périssables 

S. Eilon and R. V. Mallya 
Imperial College of Science, London, United Kingdom 


1. INTRODUCTION 

Semiperishables are defined as goods that sustain their full value (or price) 
for a given period but lose some of this value at the end of that period. Thus 
the value function of the goods assumes a decreasing stepwise form, the drops 
in value (or steps) occurring at prespecified intervals. After m such intervals 
the goods have to be scrapped. 

At any one time there may be m different classes of the commodity 
in stock. For the case in which m = 2, for example, there are two classes: one 
of fresh goods which were added to the store at the beginning of the current 
period, the other of old goods which arrived in the store during the preced- 
ing period. If another period is allowed to pass, the old goods of the present 
period will be scrapped and the present fresh goods will become old goods. 
If a new batch arrives, it will be labeled fresh goods. 

Because of price differentials between the various classes, there may be a 
distinct demand for each during any given period. These demand figures are 
naturally not independent, for the various classes are in essence competing 
with one another in the market. The sales price of fresh goods is often fixed 
by market conditions and competition of other similar fresh goods, and the 
price structure for fresh goods will then largely determine the potential 






demand for this class. This potential demand may be affected by the price 
quoted for second-class goods. The lower it is in relation to the price of fresh 
goods, the more customers for fresh goods it is likely to attract, and the more 
genuine demand for second-class goods it will generate from customers who 
would not have considered purchasing fresh goods. 

In marketing semiperishables, policies must be formulated for the follow- 
ing problems: 

1. For given price differentials and given demand data for the various 
classes, what issuing policy should be adopted? Three alternatives are 
analyzed in this paper: 

(a) LIFO policy (fresh goods are issued first; older goods are offered for 
sale only when the class of fresh goods is depleted). 

(b) FIFO policy (old goods are issued first). 

(c) A mixed strategy, when fresh and old goods are issued according to a 
predetermined ratio. 

2. What price differential should be adopted? 

3. What reorder quantity of fresh goods should be specified? 

The objective is assumed to be maximization of revenue from sales 
over n periods, when from the nth period onward the stock is no longer 
replenished with fresh goods. 


2. GOODS WITH A LIFE OF TWO PERIODS (m = 2) 

2.1. The Two-Period Case (n = 2) — Model 1 

The model for the two-period case [1] is described as follows: 

1. A pile of Q_1 fresh units arrived at the beginning of the current period; 
the price for fresh goods is p_1 per unit. 

2. Q_2 units are left over from the preceding period; the price for old 
goods is p_2 per unit (where p_2 < p_1). 

3. The demand figures for the current and the next periods (denoted 
as periods 1 and 2) are d_1 and d_2, respectively, and are assumed to be 
deterministic. 

4. If x_1 fresh units and y old units are sold during the current period 
(where x_1 <= Q_1 and y <= Q_2), at the end of the period there will be 
Q_1 - x_1 leftovers, which will become old or second-class goods to be sold next period 
at p_2 per unit; Q_2 - y will have to be scrapped, the scrap value being p_s per unit. 
The number of units sold in period 2 is x_2. 

5. The stock will not be replenished in the next period. 

If d_1 >= Q_1 + Q_2, the total stock will be depleted and the total revenue 
R is given by 

R = p_1 Q_1 + p_2 Q_2.    (1) 





or, if we substitute 

p = p_2/p_1,    s = p_s/p_1,    (2) 

r = R/p_1,    (3) 

the nondimensional revenue function is 

r = Q_1 + p Q_2.    (4) 

However, if d_1 < Q_1 + Q_2, the revenue from fresh goods is p_1 x_1 and from 
old goods p_2 y (in period 1) + p_2 x_2 (in period 2), so that 

r = x_1 + p(x_2 + y) + s(Q_2 - y + Q_1 - x_1 - x_2),    (5) 

subject to the constraints 

x_1 + x_2 <= Q_1, 
y <= Q_2, 
x_1 + y <= d_1, 
x_2 <= d_2,    (6) 

and the condition that all these quantities must be non-negative. The revenue 
function r is affected by the issuing policy, which determines the values of 
x_1, x_2, and y. Results for r are given in Appendix 1 for LIFO, for FIFO, 
and for a mixed strategy (which calls for issuing old goods and fresh goods 
in predetermined proportions). The results are tabulated for three cases: 


(a) d_1 + d_2 <= Q_1, 

(b) d_1 <= Q_1 < d_1 + d_2, 

(c) Q_1 < d_1. 

It is not difficult to see that for the first case LIFO is better than FIFO; in 
the other two cases LIFO is optimal, provided that p < (1 + s)/2; otherwise 
FIFO is generally optimal, except in case (b), when the mixed strategy yields 
a higher revenue under the additional condition Q_1 + Q_2 > d_1 + d_2. 
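The two pure policies in the two-period model can be compared numerically. The sketch below computes the nondimensional revenue r of equation (5); the issuing logic is inferred from the definitions in the text, not copied from the paper's appendix:

```python
# Revenue r of Model 1 under the LIFO and FIFO issuing policies.
# p = p2/p1 (old-goods price ratio), s = ps/p1 (scrap-value ratio).

def revenue(policy, Q1, Q2, d1, d2, p, s):
    if policy == "LIFO":            # fresh goods issued first
        x1 = min(d1, Q1)
        y = min(d1 - x1, Q2)
    else:                           # FIFO: old goods issued first
        y = min(d1, Q2)
        x1 = min(d1 - y, Q1)
    x2 = min(d2, Q1 - x1)           # period-2 sales of leftover fresh goods
    # equation (5): r = x1 + p(x2 + y) + s(Q2 - y + Q1 - x1 - x2)
    return x1 + p * (x2 + y) + s * (Q2 - y + Q1 - x1 - x2)

# Case (c), Q1 < d1: with p below (1 + s)/2 LIFO wins; above it, FIFO.
args = dict(Q1=5, Q2=5, d1=8, d2=4)
print(revenue("LIFO", p=0.4, s=0.1, **args),
      revenue("FIFO", p=0.4, s=0.1, **args))
print(revenue("LIFO", p=0.8, s=0.1, **args),
      revenue("FIFO", p=0.8, s=0.1, **args))
```

With s = 0.1 the threshold (1 + s)/2 is 0.55, and the two printed comparisons fall on opposite sides of it, matching the rule stated above.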


2.2. The n-Period Case (n > 2) — Model 2 

The model here is essentially as stated earlier, except that the stock is 
replenished with a fresh stock Q_1 at the beginning of each period for 
n - 1 periods, after which the line is discontinued. In the nth period we may 
have some leftovers from the preceding period, and this stock of second-class 
goods is then used to meet the demand in that period, but no further sales 
are made after the nth period. Stock that reaches the age of two periods 
is scrapped, so that at any time we have only two classes of goods: first class 
(fresh) and second class (leftovers from the preceding period). The demand 
d_i for period i (where i = 1, 2, ..., n) is deterministic and assumed to be 
known in advance. 





The results for this model can be summarized as follows: 

1. If Qi>d t , where i=l, 2 .... (n-l) and Qi>d n ^^d n , LIFO 
is optimal. 

2. If either of the conditions stated above is not satisfied and A 

LIFO is optimal, but if p>i(l + s) a mixed strategy is generallylndicated 

3. In the latter case the mixed strategy is chosen as follows* if 
(j - 1+ s)jj <p<(j + s)l(j + 1), where j is any integer, then for the first 
(n — j ) periods we should adopt LIFO, but FIFO for the last. For the special 
case j = n this rule calls for a FIFO policy. Note, however, that FIFO will 
yield a maximum revenue when 

( n ~~ i )Qi + 02 = 2 di . 

If Q 2 is higher than the level required in this equation, the balance 
Qz — — l)£?i] °f second-class units should be scrapped or dis- 

pensed with forthwith before depletion starts. 
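A small helper (illustrative only, assuming the rule exactly as stated above) locates the integer j from the price ratio p and scrap value s:

```python
def mixed_strategy_j(p, s, n):
    """Return the integer j with (j - 1 + s)/j < p < (j + s)/(j + 1):
    adopt LIFO for the first n - j periods and FIFO for the last j.
    Returns None if no such j exists in 1..n."""
    for j in range(1, n + 1):
        if (j - 1 + s) / j < p < (j + s) / (j + 1):
            return j
    return None

print(mixed_strategy_j(0.7, 0.0, 10))  # p = 0.7, s = 0: 2/3 < 0.7 < 3/4, so j = 3
```

With n = 10 periods this prescribes LIFO for the first seven periods and FIFO for the last three.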


3. GOODS WITH A LIFE OF m PERIODS (Model 3)

The general case of semiperishables can be described as one in which replen-
ishment takes place every period for n - 1 periods, the goods have a life of
m periods (after which they have to be scrapped), and maximum revenue is
sought over n periods. The demand d_i in period i is known, and the prices
per unit are p1 for fresh goods, p2 for second class, p3 for third class, and so
on; goods that have an age of m - 1 periods are priced at p_m per unit, and
after m periods their scrap value is p_s per unit (where p1 > p2 > ... > p_m > p_s).

An analysis of the revenue obtained for the three issuing policies yields
the following results:

LIFO should be adopted throughout if

p1 + p_{k+1} > p2 + p_k,    (7)

where k = 2, 3, ..., m, and

p_{m+1} ≡ p_s.

FIFO is optimal if these conditions are reversed, provided

(8)

Clearly this condition becomes increasingly difficult to satisfy as n increases,
thus making FIFO increasingly unsuitable as a pure strategy. When neither
the conditions for LIFO nor those for FIFO are satisfied, a mixed strategy
is called for.


4. A STOCHASTIC MODEL (Model 4)

In the models described so far the demand was assumed to be deterministic.
Let us now re-examine the two-period case of model 1 but assume that
demand is normally distributed with a mean d and standard deviation σ. The
starting stock levels in period 1 are Q1 units of fresh goods and Q2 of second-
class goods.

The probability that the demand will not exceed Q1 is denoted by Φ1,
which is the cumulative normal distribution from 0 (the area under the
normal curve from -∞ to 0 is assumed to be negligible) to the normal
deviate t1, where t1 = (Q1 - d)/σ; similarly, the probability that the demand
will not exceed Q2 is denoted by Φ2, and that for not exceeding Q1 + Q2 is
Φ_T, where the normal deviate T = (Q1 + Q2 - d)/σ.

The revenue R for a LIFO issuing policy is computed as follows:

1. When the demand in period 1 is d1 > Q1 + Q2, we have

(p1Q1 + p2Q2)(1 - Φ_T)

and no sales are made in period 2.

2. When Q1 < d1 ≤ Q1 + Q2, we have in period 1

[p1Q1 + p2(D2 - Q1)](Φ_T - Φ1),

where D2 is the mean demand level of the doubly truncated normal distribu-
tion between the deviates t1 and T; if the normal deviate corresponding to
D2 is m2, then

m2 = (θ1 - θ_T)/(Φ_T - Φ1)

and

D2 = d + σ m2

(θ_T and θ1 are the values of the standardized normal probability density
function corresponding to the deviates T and t1, respectively).

3. When d1 ≤ Q1, the expected revenue in period 1 is p1D1Φ1, where
D1 is the mean demand up to Q1 and is obtained from

D1 = d + σ m1,

where m1 is the normal deviate that corresponds to D1 and can be
found from

m1 = -θ1/Φ1.

Now, the stock Q1 - D1 is carried to period 2. If the probability that the
demand in the second period will not exceed this stock is denoted by Φ3,
corresponding to the deviate t3 = (Q1 - D1 - d)/σ, and the mean demand
up to Q1 - D1 is D3, the revenue in period 2 is

p2D3Φ3 + p2(Q1 - D1)(1 - Φ3),

where D3 can be computed in a similar way to D1. Thus the total expected
revenue function r for LIFO is

r = R/p1 = (Q1 + pQ2)(1 - Φ_T) + [Q1 + p(D2 - Q1)](Φ_T - Φ1) + D1Φ1

    + [pD3Φ3 + p(Q1 - D1)(1 - Φ3)]Φ1

    + s[(Q1 + Q2 - D2)(Φ_T - Φ1) + Q2Φ1

    + (Q1 - D1 - D3)Φ3Φ1].    (9)

The last term in this equation denotes the value of scrapped goods. Expres-
sions for the revenue for a FIFO issuing policy or for the mixed strategy can
be similarly obtained, and the method can be extended to n > 2 time periods.
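The expected-revenue expression can also be cross-checked by simulation. The sketch below (illustrative code with assumed parameters, not the authors') samples normally distributed demands in both periods under LIFO issuing; with d = 100, σ = 20, Q1 = 120, Q2 = 100, p = 0.4, and s = 0 the estimate can be compared with the 107.67 reported in Table 1.

```python
import random

def lifo_r(d1, d2, q1, q2, p, s):
    """Nondimensional two-period revenue under LIFO issuing."""
    x1 = min(d1, q1)                 # fresh goods issued first in period 1
    y = min(d1 - x1, q2)             # old goods cover any excess demand
    x2 = min(d2, q1 - x1)            # leftover fresh (now old) sold in period 2
    return x1 + p * (x2 + y) + s * (q1 + q2 - x1 - x2 - y)

def expected_lifo_r(q1, q2, p, s, d=100.0, sigma=20.0, n=100_000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        d1 = max(0.0, rng.gauss(d, sigma))   # tail below zero is negligible
        d2 = max(0.0, rng.gauss(d, sigma))
        total += lifo_r(d1, d2, q1, q2, p, s)
    return total / n

print(round(expected_lifo_r(120, 100, 0.4, 0.0), 2))  # compare with 107.67 in Table 1
```

The simulated value agrees closely with the analytical approximation, which carries only the mean leftover stock into period 2.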

Table 1

Expected Revenue — A Comparison Between Models 1 and 4

                                p = 0.4                        p = 0.7
                        Model 1        Model 4        Model 1        Model 4
                        (deterministic) (stochastic)  (deterministic) (stochastic)

(1) Q1 = 120; Q2 = 100
        LIFO            108.0          107.67         114.0          114.67
        FIFO             80.0           82.65         140.0          138.64
(2) Q1 = 100; Q2 = 80
        LIFO            100.0           98.40         100.0          103.20
        FIFO             84.0           83.66         132.0          130.00
(3) Q1 = 80; Q2 = 60
        LIFO             88.0           87.60          94.0           94.55
        FIFO             80.0           79.92         110.0          109.86

Notes. 1. In the stochastic model we assumed a normal distribution with d = 100,
σ = 20.

2. Scrap value p_s = 0.


Table 1 gives an interesting comparison between the results for models
1 and 4, from which we find that the differences between the revenues in each
of the six cases quoted in the table are very small. This would suggest that
model 1 could serve as an adequate approximation, particularly when the
object of the study is to determine which issuing policy to adopt.


5. OPTIMAL REORDER QUANTITY

This analysis can be extended to determine the stock levels and reorder
quantities at which the revenue function is expected to be maximum. The
results are given briefly:

model 1. Maximum revenue occurs when Q1 = d1 + d2 and Q2 = 0. If
there are leftovers (Q2 ≠ 0) as a result of previous decisions, take Q1 = d1 + d2
and use LIFO.

model 2. Take Q_i = d_i and Q2 = 0, but the last reorder quantity should
be Q_{n-1} = d_{n-1} + d_n. Again, if Q2 ≠ 0, use LIFO.

model 3. The optimal value of Q_i is as in model 2. The best value of Q2,
Q3, ..., Qn is zero (if not, use LIFO).

model 4. The revenue is given by (9). For the case in which s = 0 and
LIFO is used, we can subtract the acquisition cost of c per unit and
differentiate with respect to Q1 to obtain the solution

1 - Φ1 - p[(1 - Φ_T) - (Tθ_T - t1θ1) - θ1(θ_T - θ1)] = c.

Other cases can be further explored using dynamic programming methods. 


6. THE PRICE DIFFERENTIAL


So far we have assumed fixed prices for the various classes of goods, but
management may wish to change price differentials in order to affect market
demand and thereby the total revenue.

Take the two-period case in model 1. The price of fresh goods p1 may
well be determined by market conditions and the prices of competing goods,
but management may have some freedom in deciding on the price level p2
for second-class goods. Furthermore, the issuing policy may affect market
demand. If LIFO is used, the demand during the first period is d1, but if
FIFO is employed with a price p2 < p1 the demand may be stimulated to a
level αd1, at which the factor α ≥ 1 depends on the price ratio p.

If we follow the notation used for model 1, we shall have to substitute the
following conditions for (6):

x1 + y ≤ αd1

and

x2 ≤ d2,

subject to

x1 + x2 ≤ Q1,

y ≤ Q2.

All quantities must be non-negative and the revenue function is then given
by (5).


As in model 1, we can distinguish between three cases:

1. Q1 ≥ d1 + d2.

It is not difficult to see that LIFO is optimal if

(10)

whereas FIFO is optimal when the inequality sign is reversed.

2. d1 ≤ Q1 < d1 + d2.

Here LIFO is optimal if

p ≤ (1 + sα)/(1 + α),    (11)

a mixed strategy is called for when

(1 + sα)/(1 + α) < p < 1/α,    (12)

and FIFO becomes optimal when

p ≥ 1/α.    (13)

3. Q1 < d1.

LIFO is optimal if

p > [1 + s(α - 1)]/α,    (14)

and FIFO is optimal when the inequality sign is reversed.

If we extend the analysis to n periods (where n > 2) for m ≥ 2, it can be
shown that LIFO is optimal if

α_k < (p1 - p2)/(p_k - p_{k+1}),    (15)

where k = 2, 3, ..., m and p_{m+1} ≡ p_s. (The meaning of α_k is interpreted
as follows: if in any period in which the potential demand for first-class
goods is d_i we offer an older good of class k at a price p_k < p1, the demand
will be stimulated to α_k d_i.)

FIFO is optimal if the inequality sign is reversed and if

(16)

When the α's are equal to 1, the equations become those mentioned for models
1, 2, and 3.


7. CONCLUSIONS

The analysis in model 1 shows that the decision to issue by FIFO, by LIFO, or
by a mixed strategy depends on the relationship between the demand values
and the available stocks and often on the price differential. It has also been
shown that the analysis can be easily extended to cover n > 2 periods, in
which the life of the commodity is m ≥ 2. Comparison between the deter-
ministic demand model and that of stochastic demand suggests that the
former provides a fairly good approximation to the latter. The models dis-
cussed in this article can be usefully explored to determine the optimal price
differential that should be charged in order to maximize revenue.


8. REFERENCE

[1] S. Eilon, R. I. Hall, and J. R. King, Exercises in Industrial Management — A
Series of Case Studies (Case 16), Macmillan, London, 1966.








ISSUING AND PRICING POLICIES FOR
SEMIPERISHABLE GOODS

SUMMARY

In the sale of semiperishable goods, fresh goods compete with goods that are
no longer fresh, and this competition depends on the prices of the various
classes of goods. A plan must therefore be formulated for the following
problems: (1) for a given price differential and demand, which issuing policy
will maximize revenue? (2) at what price differential? (3) how much fresh
merchandise should be ordered? This study discusses these problems, and
some solutions are proposed.


A GEOGRAPHIC MODEL OF AN URBAN
AUTOMOBILE MARKET†

Un Modèle Géographique d'un Marché
Automobile Urbain

Theodore E. Hlavac, Jr.

Stop and Shop, Inc.

and

John D. C. Little
Sloan School of Management

Massachusetts Institute of Technology, Cambridge, Massachusetts
United States of America


1. INTRODUCTION 

When a person buys a car, he takes into account, either explicitly or implicitly, 
the distance he must travel to the prospective place of purchase. He is likely to 
be attracted to a nearby dealer because of easy accessibility for shopping trips 

† This work was supported in part by Project MAC, an M.I.T. research program
sponsored by the Advanced Research Projects Agency, Department of Defense, under
Office of Naval Research Contract Number Nonr-4102(01). Part of the research was
done at the M.I.T. Computation Center.





and future service visits. He is less likely to be attracted to a distant dealer,
but some attraction will exist, especially when a well-advertised dealer has an
image of low price and high throughput. In an urban market, in which there
is considerable choice of dealers, the probability of purchase from a given dealer
can be expected to fall off with distance; in fact, such a relationship is easily
established empirically.

In this article we develop a model in which a customer's probability of pur-
chase at a given dealer is affected by dealer location and customer-make prefer-
ence, as well as the locations and strengths of all other dealers. Aggregation of
the customer model gives a dealer-market-share (penetration) model, which may
also be viewed as a model of competitive interaction. This model is fitted to
data for metropolitan Chicago. After fitting, the model permits estimation of
the sales of a dealership with specified strength and location.

The most obvious practical use of the model relates to market strategy for
new dealerships in the automobile industry, but it appears to be adaptable to
site-location problems in other fields as well.


2. MODEL 

2.1. Dealer Pull 

To some extent all dealers are attractive to a person planning to buy a car.
The attractiveness of a specified dealer presumably is a function of dealer
characteristics (such as make of car sold, extent of advertising), buyer character-
istics (such as make preference), and distance from dealer to buyer. The attrac-
tiveness of a dealer is called his “pull.” Pull is not a directly observable quantity
but is used to develop expressions that are.

We shall assume that the number of car purchases is fixed for the period
under consideration. We have not modeled the effect of a dealer in creating
sales that would not have occurred in his absence. Our data do not seem to
lend themselves to the estimation of this effect, and perhaps it is small in today's
well-developed markets.

Buyers are separated into market segments. The pull of a dealer on a buyer
in a given segment is broken into two parts: (a) an intrinsic pull independent of
the make sold by the dealer and (b) the make preference of the buyer. Let

g(i, j) = the pull of dealer j on a buyer in market segment i; i = 1, ..., S;
j = 1, ..., D,

h(i, j) = the intrinsic pull of dealer j on a buyer in segment i,

q(i, m) = the make preference of a buyer in segment i for make m, m = 1,
..., M. We specify that q(i, m) ≥ 0 and Σ_m q(i, m) = 1.

Let m̂(j) denote the make sold by dealer j. We stipulate that these quantities be
related by

g(i, j) = h(i, j) q(i, m̂(j)).    (1)

Thus the pull of a dealer on a buyer is the dealer's intrinsic pull weighted
by the buyer's brand preference.



2.2. Purchase Probability


The probability that a buyer will purchase from a given dealer is taken as
the pull of that dealer on the buyer, divided by the total pull on the buyer. Let
p(i, j) be the probability that a buyer in market segment i purchases at dealer j:

p(i, j) = g(i, j) / Σ_{k=1}^{D} g(i, k).    (2)

We note that make preference can be interpreted as the probability of purchase
of the make under the condition that the sum of the intrinsic pulls on the buyer
is the same for each make. This result can be deduced from (a) and (b).


2.3. Geographic Effect

We hypothesize that pull falls off exponentially with the distance between
dealer and buyer. This is only one of many possible relations, but it appears to
work well. Let x(i, j) be the distance of the buyers in market segment i to
dealer j:

h(i, j) = a_j e^{-b_j x(i, j)}.    (3)

Here a_j and b_j are constants specific to dealer j. The constant a_j expresses the
dealer's strength in his own immediate neighborhood. The constant b_j tells how
fast his sales fall off with distance. Using (1), (2), and (3), we get

p(i, j) = q(i, m̂(j)) a_j e^{-b_j x(i, j)} / Σ_{k=1}^{D} q(i, m̂(k)) a_k e^{-b_k x(i, k)}.    (4)


2.4. Dealer Sales and Penetration

Let

N(i) = number of buyers in market segment i (called the potential of the
segment) in a given time period,

s(j) = expected sales of dealer j in the given time period,

π(j) = expected penetration of dealer j in the whole city.

Then

s(j) = Σ_{i=1}^{S} N(i) p(i, j),    (5)

π(j) = s(j) / Σ_{i=1}^{S} N(i).    (6)
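Equations (1) through (6) translate directly into a few lines of code. The sketch below uses entirely made-up segments, dealers, and parameter values to illustrate the computation; by construction the purchase probabilities for each segment sum to 1 over dealers, and the penetrations π(j) sum to 1.

```python
import math

# Hypothetical data: 3 market segments, 3 dealers selling 2 makes.
dealers = [("A", 10.0, 0.5), ("A", 6.0, 0.3), ("B", 8.0, 0.4)]  # (make, a_j, b_j)
q = [{"A": 0.6, "B": 0.4}, {"A": 0.3, "B": 0.7}, {"A": 0.5, "B": 0.5}]  # q(i, m)
x = [[1.0, 4.0, 2.0], [3.0, 1.0, 5.0], [2.0, 2.0, 1.0]]  # x(i, j): distances
N = [500, 300, 200]                                      # N(i): segment potentials

def pull(i, j):                       # eqs (1) and (3): g = h * q
    make, a, b = dealers[j]
    return a * math.exp(-b * x[i][j]) * q[i][make]

def prob(i, j):                       # eq (2): pull normalized over dealers
    return pull(i, j) / sum(pull(i, k) for k in range(len(dealers)))

sales = [sum(N[i] * prob(i, j) for i in range(len(N)))
         for j in range(len(dealers))]               # eq (5)
penetration = [s_j / sum(N) for s_j in sales]        # eq (6)
print([round(v, 3) for v in penetration])
```

Each dealer's penetration reflects both his own strength parameters and the locations and strengths of his competitors, which is the competitive-interaction feature emphasized in the text.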


3. FITTING THE MODEL TO DATA 

3.1. Data 

The model has been fitted to R. L. Polk’s new car registrations for April, 
May, and June 1963 in Cook County, Illinois (fleet sales eliminated). In order 
to maintain reasonable sample sizes with three months of data, the analysis has 





\ a e F 


lr\%\ 



-/uTI^ lL 

e. 1 


<* e 

l L_; ~e 

T 

\ 

1 — r 

V\ 1 


.2., \\ 

CO \ 

1 1 d / 13 ^ 


|iTA V 


i e \l — 


1 \ !1 ^ 


c\p O 

J jjir 


9 9 oefe 

— 

sL_. =\ 

▼ 

br- 3 a » 

7f 

& __ 

_^L_ 


c Ci ll 


®P :i a 


▼ Buie* 

® Chevrolet 
O Do d^e 
O Fed 
e O'dsmcb^e 

* pljrnouth 

* Pontiac 

* Rambfer 


e> 


TS o IS t\ 

II — l O 27 1 *frt\ 

S ) H ts ti J i 

. £5 K *C fcsC * U 

r 6 °V s e/ m\ 

« ir > -fe - 


W 




\ll 

To 7^ 

~J v / 

1 

1 1 

Ju 

f 

4_ 

V 

_r 



=7 

% 

o 

S 


i * 

, _ _ „.«« ; nto v .hich Cook County. Winch, 

arc 1. The W market ar ^ dealers considered. 

ded show the locations of the IS- ncA c ‘ GC * 



306 


THEODORE E. HLAVAC, JR. AND JOHN D. C. LITTLE 


been confined to eight major makes (Ford, Chevrolet, Rambler, Pontiac, Ply-
mouth, Buick, Oldsmobile, and Dodge) and to dealers selling 80 or more cars
in the time period. This involves 47,670 cars and 184 dealers, or 91.5 per cent
of the cars and 62 per cent of the dealers of the makes considered. The county
has been divided into 140 marketing areas or cells. The rationale is discussed
by James [1]. The location of each dealer has been determined, as have been the
sales of each dealer in each cell (Figure 1).

The market has been segmented by geographic area, that is, by cell. For
make preference we have used market share. Because the city is laid out rect-
angularly, distance has been measured rectangularly instead of diagonally. Some
alternatives to these choices are suggested.
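Rectangular distance here means city-block (L1) distance: the sum of absolute coordinate differences rather than the straight-line value. A one-line illustration:

```python
def rect_distance(p, q):
    """City-block (rectangular) distance between two points (x, y)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

print(rect_distance((0, 0), (3, 4)))  # → 7 (the diagonal distance would be 5)
```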

3.2. Fitting Procedure

The model has been fitted to the data by a modified maximum-likelihood
procedure. We observe that if the denominator of (4) is regarded as a known
constant, each dealer becomes a separate estimation problem involving two
parameters, a_j and b_j, and 140 data points, the dealer's sales in each cell. Given
all the a_j and b_j, the denominator of (4) can be calculated. This suggests an
iterative procedure. Suppressing the dealer index j, we let

n_i = sales of dealer j in cell i,

w_i = Σ_{k=1}^{D} q(i, m̂(k)) a_k e^{-b_k x(i, k)},    (7)

p_i = w_i^{-1} q(i, m̂(j)) a e^{-b x_i} = probability that a buyer in cell i will purchase at dealer j.

Then, assuming independence of purchases, we find that the likelihood function
for the observed sales figures is

L = Π_{i=1}^{140} C(N(i), n_i) p_i^{n_i} (1 - p_i)^{N(i) - n_i},    (8)

where C(N(i), n_i) denotes the binomial coefficient.

We start with a trial set of w_i. Values of a and b are then chosen to maximize
L. This is done by setting the derivatives of log L with respect to a and b to
zero and solving the resulting equations by Newton's method. The a and b
are calculated for each dealer and are then used to recompute the w_i from (7).
The process is repeated until the values of the a's and b's converge.
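The alternation between per-dealer estimation and recomputation of the denominators can be sketched as follows. This toy version is purely illustrative: it uses synthetic one-dimensional cells, invented parameters, deterministic "observed" sales, and a crude grid search in place of Newton's method.

```python
import math

cells = list(range(10))                  # cell index doubles as position
N = [200] * 10                           # potential of each cell
locs = [3, 7]                            # known dealer locations
true_a, true_b = [8.0, 5.0], [0.6, 0.4]  # invented "true" parameters

def pull(i, j, a, b):
    return a[j] * math.exp(-b[j] * abs(cells[i] - locs[j]))

# Deterministic "observed" sales: expected sales under the true parameters.
w_true = [sum(pull(i, j, true_a, true_b) for j in range(2)) for i in range(10)]
n_obs = [[N[i] * pull(i, j, true_a, true_b) / w_true[i] for i in range(10)]
         for j in range(2)]

def log_lik(j, aj, bj, w):
    """Binomial log-likelihood of dealer j's cell sales, w held fixed."""
    ll = 0.0
    for i in range(10):
        p = aj * math.exp(-bj * abs(cells[i] - locs[j])) / w[i]
        p = min(max(p, 1e-9), 1 - 1e-9)
        ll += n_obs[j][i] * math.log(p) + (N[i] - n_obs[j][i]) * math.log(1 - p)
    return ll

a_hat, b_hat = [1.0, 1.0], [0.1, 0.1]                 # trial starting values
grid = [(0.5 * m, 0.05 * k) for m in range(1, 30) for k in range(1, 20)]
for _ in range(20):                                   # outer iteration on the w_i
    w = [sum(pull(i, j, a_hat, b_hat) for j in range(2)) for i in range(10)]
    new = [max(grid, key=lambda ab, j=j: log_lik(j, ab[0], ab[1], w))
           for j in range(2)]
    done = all((a_hat[j], b_hat[j]) == new[j] for j in range(2))
    a_hat = [ab[0] for ab in new]
    b_hat = [ab[1] for ab in new]
    if done:
        break
print(a_hat, b_hat)
```

The key structural point is the same as in the paper: holding the w_i fixed decouples the dealers, so each round solves D small two-parameter problems and then refreshes the shared denominators.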

3.3. Results 

Figure 2 shows the difference between actual sales and model-predicted
sales for the 184 dealers. The two are very close. This, of course, is desirable,
but it is only a moderately good measure of model fit. Although the fitting is
done on cell sales rather than dealer sales and there is no requirement that
dealer sales fit exactly, it would be expected from the nature of the calculation
that the fit would be close. Figure 3 compares actual and predicted penetration
by cell for one dealer. Most of the differences are the result of random
fluctuations arising from the small sample sizes involved in any given cell.







Exhibit I. An interactive model of the Chicago automobile market: a console
session in which the user asks, among other queries, how an increase in a cell's
potential affected Dodge and what the effect has been on each of the other makes.




For a better appraisal of fit we have hypothesized that, for a given dealer,
the probability of purchase in cell i is not p_i but p_i', where

p_i' = p_i(1 + ε_i)

and ε_i is a random disturbance term. The standard deviation σ of ε_i may be
viewed as a type of coefficient of variation. The value of σ has been estimated
for each dealer and has a median value of .32. This seems reasonably good,
considering that it refers to the individual cell probabilities.


4. AN INTERACTIVE COMPUTER SYSTEM

Conceivably we could use the model to work out a mathematically optimal
pattern of dealers over the city or, more modestly, the optimal location of a new
dealer. Such an optimization would probably be sterile. A decision on a dealer-
ship involves many factors not included in the model: the availability of property,
financing, the microgeography of the location, and so on. Perhaps some of these
factors can be modeled, but up to now they have not been. Yet we do not want
to wait to take advantage of what we can learn about macrogeography and
competitive interaction. What is needed is a convenient way to make the infor-
mation available to a person working on dealership problems.

To demonstrate how this can be done, an interactive computer system
was programmed for the model on the Project MAC time-shared computer
at M.I.T. With this system, a user can sit at a remote console of the computer,
make hypothetical changes in dealerships, and learn immediately the model's
prediction of the effects. See Exhibit I for an example of the system in action.
Changes that a user can make include adding a dealer, moving a dealer, elimi-
nating a dealer, changing a dealer's a and b parameters, and changing the
potential of a cell.


5. POSSIBLE IMPROVEMENTS

Experimentation with the functional form of the distance relation would be of
interest. Similarly, different measures of “distance” might be tried. Travel
time has strong intuitive appeal. Much better market segmentation is possible.
The natural one would be to break down the cell population by make-model
year of car owned. Then make preference could be related to the buying rates
of each segment for each make. Another subject for investigation is the effect
of clustering of dealers. Perhaps there is a special advantage to being in an
“automobile row” because the row itself attracts customers. If so, the model
might be modified to account for this. A highly desirable line of research would
be to investigate how the a and b parameters are related to various observable
characteristics of the dealer.


6. REFERENCE

[1] J. W. James, “An Analysis of the Optimum Market Representation Policies
Relating to a Large Metropolitan Market,” M.S. thesis, M.I.T., 1964.





A GEOGRAPHIC MODEL OF AN URBAN
AUTOMOBILE MARKET

SUMMARY

A person chooses a place of purchase partly for its location, closer to or farther
from home. There is thus a distance relationship that is important in market
sales. We study a new urban automobile dealership. A scale of distance is
established from the buyer's location to the seller's. Assumptions about buyer
behavior form the basis of the attitude adopted by the seller. The model is
fitted to varying statistics over a three-month period in Chicago. An interactive
computer program immediately works out why a sale took place at one location
rather than another. For example, one can add, eliminate, or move a dealership
and examine the resulting changes in sales.


ON SOME MARKETING MODELS 
Sur Quelques Modèles de Marketing

Shiv K. Gupta 

Indian Institute of Management 
Calcutta, India 


1. INTRODUCTION 

This article† represents an attempt to formulate a generalized model of sales as
a function of the factors that influence it, namely, advertising, price, distribu-
tion effort, and product quality. The differential-equation approach suggested
opens up the possibility of a unified approach to the marketing problems
that have engaged the attention of many operations research scientists‡ in
recent years.

Starting with the simple case of a single firm and a single product situation,
the differential-equation approach offers a suitable framework for extensions
to the multifirm and multiproduct situations. Several specific models have been
considered.

† The author acknowledges the helpful suggestions and comments of his colleagues
Mr. K. S. Krishnan and Mrs. Lakshmi Mohan.

‡ A detailed bibliography is given by Dr. Jerome D. Herniter and Dr. Howard
in Progress in Operations Research, Vol. II.



2. DIFFERENTIAL EQUATION APPROACH

This section deals with the relations between the total sales and each of the
decision variables taken separately. Further, a general mathematical model is
developed between sales and all the decision variables taken together.

2.1. Single Firm and Single Product

(a) Total Sales as a Function of Advertising Only

Let the sales be an increasing function of advertising. The incremental
sales, however, tend to zero as advertising increases indefinitely.† The rapidity
with which the sales function flattens depends on the nature of the product
and the marketing situation. Let

S = a[1 - exp(-γA)],

where S = total sales,

A = advertising expenditure,
a = maximum market potential,
γ = constant.

Then

dS/dA = aγ exp(-γA)

= γ(a - S)

= φ1(a - S),

where φ1(a - S) is some function of the untapped potential (a - S). For a
number of other models it is found that

dS/dA = φ1(a - S).    (1)

If the form of φ1 is known, the differential equation (1) can be solved. The
following table gives the explicit relationship between S and A for various
functional forms of φ1.

φ1(a - S)                     S

a - S                         a[1 - exp(-A)]

k1(a - S)                     a[1 - exp(-k1A)]

(a - S)²                      a²A/(1 + aA)

k1(a - S) + k2(a - S)²        a[exp(k1A) - 1]/[exp(k1A) - γ],  where γ = ak2/(k1 + ak2)


† In some cases it has been observed that an excess of advertising has a negative effect
on sales.
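The closed forms can be checked numerically. The snippet below (an illustration with arbitrary constants) verifies by central finite differences that S = a[1 - exp(-γA)] satisfies dS/dA = γ(a - S), and that S = a²A/(1 + aA) satisfies dS/dA = (a - S)².

```python
import math

a, gamma, h = 5.0, 0.3, 1e-6

S1 = lambda A: a * (1 - math.exp(-gamma * A))   # solution for phi1 = gamma*(a - S)
S2 = lambda A: a * a * A / (1 + a * A)          # solution for phi1 = (a - S)**2

for A in [0.5, 1.0, 5.0, 20.0]:
    d1 = (S1(A + h) - S1(A - h)) / (2 * h)      # numerical dS/dA
    assert abs(d1 - gamma * (a - S1(A))) < 1e-5
    d2 = (S2(A + h) - S2(A - h)) / (2 * h)
    assert abs(d2 - (a - S2(A)) ** 2) < 1e-5
print("ok")
```

Both solutions also exhibit the saturation behavior described in the text: S approaches the market potential a as A grows without bound.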





(b) Similarly, when price is the only control variable, it is found that

dS/dp = -φ2(S),    (2)

where φ2(S) is some function of S. If the form of φ2 is known, then (2) can
be solved to obtain the relationship between sales and price.

The following table shows the relationship between S and p for various
forms of φ2.

φ2(S)              S

b                  a - bp

bS                 a exp(-bp)

k1S + k2S²         k1/[exp(k1p + k2) - k2]


(c) Sales as a Function of Both Price and Advertising

When sales S is a function of both price and advertising, the total incre-
mental sales can be expressed as

dS = (∂S/∂A) dA + (∂S/∂p) dp,

where ∂S/∂A and ∂S/∂p are partial incremental sales with respect to A and p.
Using the models postulated in (a) and (b),

dS = φ1(a - S) dA - φ2(S) dp,    (3)

where a is the sales when A → ∞ for a given value of p. The necessary con-
dition for (3) to be solvable is

(∂/∂p)φ1(a - S) + (∂/∂A)φ2(S) = 0.    (4)

The explicit relationship between sales and price and advertising can be found
only when (4) is satisfied. The following table shows the relationship between
S and p for various forms of φ1 and φ2:

φ1(a - S)        φ2(S)        S

a(p) - S         b(A)         (k1 - bp)[1 - exp(-A)]

a(p) - S         b(A)S        k1 e^{-bp}[1 - exp(-A)]

[a(p) - S]²      b(A)         k1 - bp - 1/(k2 + A),  where k1, k2, and b are constants


(d) Sales as a Function of w Variables

Suppose that sales depend on w variables x1, ..., x_w. The effect of
variable x_j on sales is given by ∂S/∂x_j = φ_j, where φ_j is independent of x_j
and a function of S and the control variables other than x_j.

The total sales S then satisfy the differential equation

dS = Σ_j φ_j dx_j.

The condition for integrability is

Σ_{j=1}^{w} [1/(φ_j φ_{j+1})] (∂φ_j/∂x_{j+1} - ∂φ_{j+1}/∂x_j) = 0,    (5)

where

φ_{w+1} ≡ φ1 and x_{w+1} ≡ x1.

2.2. Multicompetitors and a Single Product

Suppose there are n competitors in a market for a single product. The total
sales S_i of the ith competitor are given by S_i = f_i S, where

S = total sales of all competitors,

f_i = share of market of the ith competitor.

It is reasonable to assume that the behavior of the share of the market with
respect to the various factors, such as price and advertising, will be similar
to that postulated earlier for total sales; that is,

∂f_i/∂A_i = φ_{i1}(ā_i - f_i),

∂f_i/∂p_i = -φ_{i2}(f_i),

where A_i and p_i are the advertising expenditure and price of the ith competitor,
respectively, and ā_i the maximum share of the market of the ith competitor;
ā_i is independent of A_i. Hence df_i = φ_{i1}(ā_i - f_i) dA_i - φ_{i2}(f_i) dp_i. In general,
when there are w_i control variables x_{ij} (i = 1, 2, ..., n; j = 1, 2, ..., w_i) for
the ith competitor, then

df_i = Σ_j φ_{ij} dx_{ij}.

The condition for integrability is

Σ_{j=1}^{w_i} [1/(φ_{ij} φ_{i,j+1})] (∂φ_{ij}/∂x_{i,j+1} - ∂φ_{i,j+1}/∂x_{ij}) = 0,

where

φ_{i,w_i+1} ≡ φ_{i1} and x_{i,w_i+1} ≡ x_{i1}.


2.3. Multicompetitors and Multiproducts

Consider a market situation in which there are n competitors, each of whom
sells m products of different qualities. The decisions on price and advertising
expenditure by the competitors for one product affect the sales of the other
products. The total sales S_j for the jth product are assumed to satisfy the
differential equation

dS_j/dX_j = φ_j(a_j - S_j),

where φ_j is a known function and X_j is the effective advertising for product j,
defined as follows:

X_j = m k_{jj} E_j - Σ_{λ=1}^{m} k_{jλ} E_λ,

where

E_j = Σ_i (a_{ij} A_{ij})^{e_{ij}},

with a_{ij}, e_{ij}, and k_{jλ} as constants; S_j depends on advertising only, and A_{ij} is
the advertising expenditure of the ith competitor for the jth product. The
net revenue R_i of the ith competitor is then

R_i = Σ_{j=1}^{m} [S_j f_{ij}(p_{ij} - c_{ij}) - A_{ij}].

The share of the market f_{ij} of the ith competitor for the jth product depends on
the form of φ_j. The problem is to maximize R_i (i = 1, 2, ..., n) with respect
to p_{ij} and A_{ij} (j = 1, 2, ..., m).


3. SPECIFIC MATHEMATICAL MODELS 


Suppose nissl, that is, suppose there is only one product and each competitor 
has two control variables. Let the sales of the 7 th competitor be given by 


where 


Si = S 


Vi + u 


(atAt) e ‘ 

2UMY' 


+ 


nk(p—pi) 

(»-l) 


S = total sales, 

A_i = promotional effort of the ith competitor,
p_i = price charged by the ith competitor,

p̄ = (1/n) Σ_{i=1}^{n} p_i,

and v_i, u, and k are positive constants such that

Σ_{i=1}^{n} v_i + u = 1.


ON SOME MARKETING MODELS 


317 


We can easily verify that ∂S_i/∂A_i is of the form ψ_i1(a_i − S_i) and that ∂S_i/∂p_i
is of the form −ψ_i2(S_i), where a_i is independent of A_i. Hence R_i =
S_i(p_i − c_i) − A_i. This problem has been solved in [3] when the total market
S is as indicated.

(a) Model I: S is constant, independent of price and advertising. 

(b) Model II: S depends on advertising only. 

(c) Model III: S depends on price only.

(d) Model IV: S depends on both price and advertising. 

When the total market S is independent of price and advertising and the
prices of all competitors are equal and known, the model [3] discussed in this
article reduces to Mills's model [2]. When both price and advertising are
decision variables, the equilibrium values of p_i are found to be different; hence
the results of Mills differ from the results derived here.

In particular, when n = 2 and sales S_i for the ith competitor are given by


S_i = S [ a_i A_i / (a_1 A_1 + a_2 A_2) + k (p_j − p_i) ],   j ≠ i,


where S = total market (assumed to be constant), then ∂S_i/∂A_i and ∂S_i/∂p_i are
of the forms ψ_i1(S − S_i) and −ψ_i2(S_i), respectively. This model has been solved
in [4]. Its solutions have been expressed in terms of two parameters


α = a_1/a_2   and   λ = k(c_2 − c_1).

As a particular case the equilibrium solutions have been obtained when the
promotional efforts of both competitors are equally effective, that is, when
α = 1. For nonboundary solutions the results in [4] coincide with those of
[2] when equilibrium values are substituted for p_1 and p_2. For some values of
α it has been found that there exists λ such that the profits of the two competitors
in the equilibrium situation are equal.
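As a purely numerical illustration of this duopoly model, the short sketch below evaluates the sales S_i and the net revenues R_i = S_i(p_i − c_i) − A_i for one assumed parameter set; every value (S, k, a_i, A_i, p_i, c_i) is a hypothetical choice for illustration, not a figure from [4].

```python
# Hypothetical numerical illustration of the duopoly sales model; none of
# these parameter values come from the paper.
S_total = 1000.0    # total market S, assumed constant (Model I)
k = 0.05            # price-sensitivity constant
a = (1.0, 1.0)      # advertising effectiveness a_1, a_2 (so alpha = a_1/a_2 = 1)
A = (40.0, 60.0)    # promotional outlays A_1, A_2
p = (10.0, 9.5)     # prices p_1, p_2
c = (6.0, 6.0)      # unit costs c_1, c_2

def sales(i):
    """S_i = S[a_i A_i / (a_1 A_1 + a_2 A_2) + k(p_j - p_i)], with i, j in {0, 1}."""
    j = 1 - i
    share = a[i] * A[i] / (a[0] * A[0] + a[1] * A[1])
    return S_total * (share + k * (p[j] - p[i]))

for i in range(2):
    S_i = sales(i)
    R_i = S_i * (p[i] - c[i]) - A[i]   # net revenue R_i = S_i(p_i - c_i) - A_i
    print(f"competitor {i + 1}: sales = {S_i:.1f}, net revenue = {R_i:.1f}")
```

Note that the two sales always sum to S, since the advertising shares sum to one and the symmetric price terms cancel.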


3.1. Multiperiods: Single Firm and Single Product

In the models postulated earlier all the decisions pertain to a single period.
Because the decisions taken in any period affect decisions in subsequent periods,
it is necessary to study the distributed lags in advertising and other promotional
efforts. The effects of promotional effort in various periods have been studied
by a number of authors who used time-series analysis. To measure the effect
of advertising in terms of net revenue, Julian Simon [5] discussed a method
under the assumption that the effect of advertising diminishes at a constant rate.
He has shown that under this assumption the model for measuring effectiveness
requires knowledge of

(a) sales in the current period, 

(b) sales in the prior period, 

(c) the retention rate (assumed to be constant). 





Here, by use of the assumptions in [5], the behavior of the sales function is
studied with the differential equation approach.

Let A_t be the promotional expenditure in the tth period. Let the effect of
promotional effort be distributed over several periods and diminish at a
constant rate, and let E_t be the effective promotional expenditure in time t,
which includes part of the expenditures in periods before t.

In terms of the actual promotional expenditures A_t, we have

E_t = k_t A_1 + k_{t−1} A_2 + ··· + k_1 A_t = Σ_{i=1}^{t} k_{t+1−i} A_i.

If S_t and S_{t−1} are the sales in time t and t − 1, respectively, it can be shown
[5] that the sales due to E_t are (1 − b)^{−1} S_t − b(1 − b)^{−1} S_{t−1}, where b is the
retention rate. Hence S_t = b S_{t−1} + f_t(E_t), where f_t(E_t) is some unknown
function of E_t. Thus

S_t = f_t(E_t) + b f_{t−1}(E_{t−1}) + ··· + b^{t−1} f_1(E_1).

When t = 1, the sales in the first period are a function of E_1 only. Hence
dS_1 = (∂S_1/∂E_1) dE_1 = φ_1(a_1 − S_1) dE_1, where φ_1(a_1 − S_1) is a function of the
untapped potential (a_1 − S_1). When t = 2, the sales in the second period
are a function of E_1 and E_2:


dS_2 = (∂S_2/∂E_1) dE_1 + (∂S_2/∂E_2) dE_2,

where

S_2 = b S_1 + f_2(E_2).

Hence

∂S_2/∂E_1 = b (∂S_1/∂E_1) = b φ_1(a_1 − S_1)

and

∂S_2/∂E_2 = φ_2(a_2 − S_2),


where φ_2(a_2 − S_2) is a function of the untapped potential (a_2 − S_2). The
maximum market potential a_2 in the second period is a function of E_1. In
general,

dS_t = (∂S_t/∂E_1) dE_1 + (∂S_t/∂E_2) dE_2 + ··· + (∂S_t/∂E_t) dE_t
     = b^{t−1} φ_1(a_1 − S_1) dE_1 + b^{t−2} φ_2(a_2 − S_2) dE_2 + ··· + φ_t(a_t − S_t) dE_t,   (6)

where a_t is the maximum potential in the tth period and a function of E_1,
E_2, ..., E_{t−1}.

If the functional forms of φ_1, φ_2, ..., φ_t are known, then S_1, S_2, ..., S_t
can be found by solving (6) successively for t = 1, 2, 3, ....
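The successive solution can be sketched numerically. In the fragment below the retention rate b, the decay weights k_i, the budgets A_i, the fixed potential a_max, and the linear response φ_t(x) = rx are all illustrative assumptions, not values from the paper; the recursion S_t = bS_{t−1} + f_t(E_t) is realized by integrating dS/dE = r(a_max − S) over each period's effective expenditure.

```python
import math

# Illustrative successive solution of the multiperiod sales recursion.
# All numbers here (b, r, k, A, a_max) are assumed for the sketch.
b = 0.7                          # retention rate (constant, as in [5])
r = 0.002                        # assumed response coefficient in phi_t
k = [1.0, 0.5, 0.25, 0.125]      # decay weights k_1 .. k_4
A = [100.0, 80.0, 120.0, 90.0]   # promotional expenditure A_1 .. A_4
a_max = 500.0                    # maximum market potential, held fixed here

def effective(t):
    """E_t = sum_{i=1}^{t} k_{t+1-i} A_i (periods are 1-indexed)."""
    return sum(k[t - i] * A[i - 1] for i in range(1, t + 1))

# With dS/dE = r(a_max - S), integrating from the carried-over level
# b*S_{t-1} over E_t gives S_t = a_max - (a_max - b*S_{t-1}) exp(-r E_t).
S = []
S_prev = 0.0
for t in range(1, 5):
    S_prev = a_max - (a_max - b * S_prev) * math.exp(-r * effective(t))
    S.append(S_prev)
print([round(s, 1) for s in S])
```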

In [6] this approach is extended to solve the multiperiod case of multiproducts
and multicompetitors.



MARKETING 


319 


4. REFERENCES 

[1] M. F. Shakun, "Advertising Expenditures in Coupled Markets," Management
Science, Series B, 11, 42–47 (February 1965).

[2] H. D. Mills, in Mathematical Models and Methods in Marketing (Bass et al., eds.),
Wiley, New York, 1961.

[3] K. S. Krishnan and Shiv K. Gupta, "Mathematical Models in Marketing" (un-
published).

[4] K. S. Krishnan and Shiv K. Gupta, "Mathematical Model for a Duopolistic
Market," accepted for publication in Management Science.

[5] Julian Simon, "Are There Economies of Scale in Advertising?" J. Advertising Res.,
5, No. 2, 15–20.

[6] Shiv K. Gupta and K. S. Krishnan, "Differential Equation Approach to Mar-
keting" (unpublished).


SUR QUELQUES MODÈLES DE MARKETING


RÉSUMÉ

Le but de cet article est de montrer le rapport entre les dépenses publicitaires,
les prix, les débouchés et la qualité des produits en conséquence de leur vente
et de leur profit. On emploie la méthode de "la théorie des jeux" pour créer
les modèles des problèmes commerciaux. On obtient les solutions pour le
cas de n compétiteurs et pour de multiples produits. Les applications et les
résultats sont illustrés par de nombreux exemples.


Un Modèle de Simulation pour Évaluer l'Efficacité d'un
Plan de Supports

A Simulation Model for Evaluating the Efficiency of a Media Plan 

N. Steinberg, G. Comes, et J. Raracite
Société AVROC, France


Le but de ce document est de montrer comment a été conçu et réalisé un
système automatique permettant:

— d'évaluer l'efficacité d'un plan de média en termes de perception des
messages émis.

Le programme de calcul écrit pour l'ordinateur IBM 7094 s'intitule:
Média-Planex.

Le planning des média a pour objectif de distribuer quantitativement des
messages dans la population en vue de maximer la perception de l'information
émise.



320 


ABSTRACTS 


Les éléments de base de cette approche sont les suivants:

— les enquêtes de fréquentation des supports publicitaires par la population,
— les enquêtes de perception des messages publicitaires selon une échelle
sémantique en fonction de l'exposition de la population à la publicité.

Dans sa seconde phase, il effectuera également sur la base des mêmes
critères la sélection optimale des média et des supports publicitaires.


The purpose of this article is to show how an automatic system can be set 
up to permit the assessment of media-plan efficiency in terms of recall of an 
advertisement. The computer program written for the IBM 7094 is entitled 
“Media-Planex.” 

The first of the media-planning objectives is to distribute messages to a 
population to maximize perception or recall. The basic steps are the following: 

— media surveys of the reading, listening, and viewing habits of a population,

— review of post-testings. 

Media-Planex computes media-plan efficiency as a forecast, that is, the 
percentage of the population that will recall the messages. In addition, a pro- 
gram called Media-Planex optimum selects on the same basis the best media 
plan for a given advertising budget. 


A Model for Marketing 

TJn Modele de Marketing 

K. A. Jones 

Mobil Oil {Australia) Ltd. 

G. Gregory 

University of Melbourne 
Australia 


In a study of gasoline marketing in Melbourne an attempt is being made to 
rationalize the siting and operation of the service stations of a particular com- 
pany. Three approaches are being used: 

1. Consumer survey work to measure the preferences and opinions of 
buyers. 

2. Analysis of observations of the behavior of relevant characteristics in 
the location of marketing. 

3. Study of the economic organization of selling units in anticipation of
fluctuating market conditions.





A brief account is given of the problems in consumer survey work peculiar
to this situation. For the analysis of observations of a service station a mathe-
matical model has been set up to allow for the random variations in the arrival
pattern and the service times of customers and also to take into account the
tendency of customers to balk when too many customers are already present.
The model is analyzed by methods of queuing theory. Uses to which this
analysis may be put are outlined. Relevant factors in the study of the economic
organization of service stations are given.
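The arrival-service-balking mechanism described can be illustrated with a small event-by-event simulation; the rates, the number of pumps, and the balking threshold below are invented for illustration and are not figures from the study.

```python
import random

# Illustrative simulation of a service station with Poisson arrivals,
# exponential service at several pumps, and balking above a threshold.
# All rates and thresholds are assumptions, not survey values.
random.seed(1)
ARRIVAL_RATE = 1.0   # cars per minute
SERVICE_RATE = 0.4   # services per pump per minute
PUMPS = 3
BALK_AT = 6          # an arriving customer leaves if this many are present

def simulate(minutes):
    """Return (served, balked) counts over the given horizon."""
    t, n, served, balked = 0.0, 0, 0, 0
    while t < minutes:
        total_rate = ARRIVAL_RATE + min(n, PUMPS) * SERVICE_RATE
        t += random.expovariate(total_rate)          # time to next event
        if random.random() < ARRIVAL_RATE / total_rate:
            if n >= BALK_AT:
                balked += 1                          # customer balks
            else:
                n += 1                               # customer joins
        elif n > 0:
            n -= 1                                   # a service completes
            served += 1
    return served, balked

served, balked = simulate(1000.0)
print(f"served: {served}, balked: {balked}")
```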


Dans une étude sur la vente de l'essence à Melbourne, on s'est efforcé de
rationaliser la distribution et le fonctionnement des stations-service d'une
compagnie particulière; on envisageait trois méthodes:

— une étude du marché de la consommation pour évaluer les préférences
et les opinions des acheteurs,

— analyse des observations faites sur l'évolution des caractéristiques se
rapportant à la distribution des marchés,

— une étude de l'organisation économique des points de vente en prévision
de fluctuations des conditions du marché.

Il est donné un bref résumé des problèmes dans un travail sur le marché de
la consommation relatif à cette situation. Pour l'analyse des observations faites
à un point de vente, on a établi un modèle mathématique tenant compte des
variations de hasard dans le rythme des arrivées et le temps dévolu à chaque
client, et tenant aussi compte de la tendance qu'ont les clients à hésiter quand
il y a trop de monde qui attend déjà. On étudie le modèle en utilisant les
méthodes de la théorie des files d'attente; on souligne les utilisations éventuelles
de cette analyse. On donne les facteurs en rapport dans l'étude de l'organisation
économique des stations-service.


Le Comportement Spatial de la Clientèle des
Supermarchés Marseillais

Spatial Distribution of the Clientele of Marseilles Supermarkets 

P. Drouet et A. Leroy 
Marseille , France 


Cette communication présente quelques résultats obtenus dans le cadre de
recherches menées par le Centre d'Études de la Distribution sur l'équipement
commercial et le comportement de la clientèle dans le Sud-Est de la France.

Une enquête a porté sur plus de 4000 clients de 4 supermarchés de l'agglo-
mération marseillaise. Elle a permis de dégager que la répartition de la clientèle
en fonction de la distance domicile-magasin s'ajustait à une loi log-normale.





Par ailleurs, les supermarchés de la ville considérée paraissent encore jouer un
rôle très important de "commerce de quartier" traditionnel et leur fonction ne
correspond pas à la conception du supermarché que l'on rencontre couramment
dans d'autres endroits.


This article gives the results obtained through a "Centre d'Etudes de la Distri-
bution" research program on commercial equipment and customers' behavior
in southeastern France.

A sample of more than 4000 people was interviewed in four supermarkets. 
It was possible to apply a lognormal law to the distribution of customers accord- 
ing to the distance from their homes to the shopping center. Furthermore, in 
this town it appears that supermarkets are still viewed as traditional neighbor- 
hood shops and that their function is far from the usual concept of supermarkets 
in other places. 


The Long-Term Equilibrium Conditions of the Plastics Industry 

in Japan 

Les Conditions d'Équilibre à Long Terme pour l'Industrie Plastique

au Japon

T. Takehiko Matsuda 
Tokyo Institute of Technology 
T. Yokoyama 
Osaka University 
O. Abe

Tokyo Institute of Technology 
Japan 


An attempt has been made in this article to develop an integrated system of 
forecasting the demand for polyvinylchloride over a long period. The system 
includes supply and demand structures and a method of price determination. 

Because demand forecasting of polyvinylchloride must take into account the 
supply and demand structure of the entire plastics industry, we have considered 
the following two points. 

1. Competition with other plastics or materials in demand. 

2. Making price endogenous, since price is used as an explanatory variable 
in this system. 

Competition in the market is based on the characteristics and the market 
price of each plastic. On the other hand, price is controlled by the balance of 




supply and demand; the supply of each plastic is considered a function of the 
cost of production. 


Cette étude propose de développer une méthode d'intégration pour le calcul de
la demande à long terme dans l'industrie du polyvinylchloride. La méthode
intégrera les facteurs d'offre et de demande et la formation des prix.

Le calcul de la demande pour le polyvinylchloride peut être exécuté en lui-
même aussi bien que calculé dans la structure d'offre et de demande de toute
l'industrie plastique. On peut donc considérer deux facteurs:

— la concurrence du polyvinylchloride avec les autres plastiques,

— l'établissement d'un prix endogène comme une variable explicative du
système.

Le prix de chaque plastique détermine la concurrence dans le marché.
D'autre part l'équilibre d'offre et de demande règle chaque prix du plastique.
Le prix de base en détermine l'offre.


Socioeconomic Status of the Family and Housewife Personality, 
Life Style, and Opinion Factors 

Statut Socio-Économique de la Famille et Personnalité,
Style de Vie, et Facteurs d'Opinion de la Ménagère

E. A. Pessemier and D. J. Tigert
Purdue University, Lafayette, Indiana, United States of America


One major objective of the ongoing “Consumer Behavior Research Project” 
at Purdue has been the investigation of ways to expand the range of measurable 
consumer characteristics that may be useful in predicting purchasing, communi- 
cation, and other market-related behavior of homemakers. This article discusses 
the identification of factors associated with housewife personality, life style, 
and opinion characteristics, and their relationship to selected demographic 
measures. 

The analysis reported here uses data from three sections of a large, self-
administered questionnaire completed by 540 housewives from the greater
Lafayette, Indiana area: (a) life style data including daily activities, interests,
and opinions, (b) personality data, and (c) demographic information. The
data from sections (a) and (b) were factor-analyzed, and a factor-score profile
was computed for each subject. Then simple and canonical correlation analyses
were made between the factor scores for individual housewives and demographic
measures for the same subjects. Finally, a multiple regression analysis between
the Bureau of the Census socioeconomic status score of the family and house-
wife factor scores was completed.





Un des objectifs majeurs du "projet de recherche sur le comportement du
consommateur" qui est actuellement en cours à Purdue a été de trouver des
moyens d'élargir l'éventail des caractéristiques calculables du consommateur
qui peuvent être utilisées dans la prévision d'achats, de communication et de
toute autre action des fabricants concernant le marché. Cet article étudie
l'identification de certains facteurs associés à la personnalité, la manière de
vivre, et les opinions caractéristiques de la ménagère et leur relation avec les
chiffres démographiques choisis.

L'analyse suivante utilise des données obtenues par trois sections différentes
d'un long auto-questionnaire rempli par 540 ménagères de Lafayette, Indiana.
Ces trois sections renferment des données sur les points suivants: (a) genre de
vie (comprenant activités journalières, secteurs d'intérêt et opinions), (b)
personnalité, (c) informations démographiques. Les résultats des sections (a)
et (b) ont été analysés, mis en ordre logique et sous forme de chiffres codés,
et l'on a cherché pour chaque sujet à représenter ces chiffres codés. Puis il a
été effectué des analyses de corrélation simples et strictes entre les données
codées pour les ménagères individuelles et les chiffres démographiques pour
les mêmes sujets. Enfin on a procédé à une analyse à régression multiple entre
les chiffres de l'étude socio-économique du Bureau de recensement concernant
la famille et les données codées pour les ménagères.



SESSION C 


TRANSPORTATION 

Transport 

Chairman: E. C. Williams ( United Kingdom) 




OPERATIONS RESEARCH IN 
NORTH AMERICAN RAILROADING 


La Recherche Opérationnelle dans les
Compagnies Ferroviaires Nord-Américaines

Peter B. Wilson

Research & Development Department, Canadian National Railways
Montreal, Quebec , Canada 


1. INTRODUCTION 

Contemplating and attempting to analyze complex systems is a very popular
pastime nowadays. If we were to set out to design an ideal transportation
system, we would naturally begin by studying the nature of the demand for
transportation. This would be a study of the characteristics of the items to be
moved, the distances and volumes involved, and the speed, reliability, and other
special-service requirements. Next we would study the characteristics of the
various modes of transportation and their strengths and weaknesses in meeting
the demand and services required. Finally we would sit down to design an
integrated system made up of the proper admixture of the different modes of
transportation.

Unfortunately, the North American transportation system was not con- 
ceived in this way. Like Topsy, it just grew. During its growth it has been bent 
by a variety of extraneous factors, political, social, and technological, which have 
distorted the effect of normal economic forces. The result has been a chaotic 
patchwork suffering from what has been described as "a history of specific
actions addressed to specific problems of specific industries at specific times" [1].

The railroads are no exception. They emerged somewhat abruptly from a 
monopolistic, noncompetitive period in which they had more or less their own 
way and in which design and construction were unrestrained and uncoordinated. 
They are now facing a challenge from new and aggressive competitors and at 
the same time are plagued by unprofitable services and by regulations and 
practices that are outmoded relics of the old monopoly days. Their continued
existence as a healthy and viable force depends on their ability to adjust to the 
changing transportation environment. It is of interest in this connection that 
North America is the only major area in the world in which railroads are not
state-owned.

We seem to have here a natural field for operations research. These are 
precisely the conditions under which operations research was born and in which 
many feel that it had its finest hours. 



328 


PETER B. WILSON 


Alas, no Blackett has appeared. Despite the publicity given to operations 
research in the postwar years and the many approaches and proffers of help to 
the industry, it has not made any significant penetration of North American 
railroading. In order to understand why, it is necessary to examine some of the 
characteristics of modern railroading. 


2. SOME CHARACTERISTICS OF RAILROADING 


A railroad is, on the surface, a fairly simple system. A car is loaded, proceeds 
by means of a pick-up and delivery (P & D) system to some assembly point 
where it meets other cars, is sorted with them on a destination basis, and even- 
tually departs on a train headed in the general direction of its destination. It 
may have to be resorted en route, change trains and even railroads, but even-
tually, we hope, it reaches the destination terminal. Here another P & D system
collects the car and delivers it at the consignee’s siding for unloading. 

This is basically no different from other transportation systems. What makes 
railroading awkward is the sheer weight of the numbers involved and the com- 
plexity of the interactions between the many elements that make up the system. 
In any year there will be literally millions of individual car movements between 
millions of different origin/destination combinations. A recent attempt to 
computerize the railroad tariff system indicated, for example, some 40 trillion 
possible tariff combinations. On a typical railroad there will be at this moment 
some 100,000 pieces of freight-car equipment of many different shapes and 
sizes, some specialized by commodity, some general purpose, some empty, 
some loaded, bound for many different places. Most of these pieces will be 
handled at least once today. The volume of data handled by the average rail- 
road is enormous by any standard, and obtaining timely data for operational 
control presents difficulties unparalleled in any other industry. 

The apparent simplicity of a railroad system is a snare and a delusion. We 
are dealing, in fact, with an extremely complex system, one of the most complex 
conceivable, and in which the interactions between the elements are imperfectly, 
if at all, understood. Railway costing, for example, is a nightmare. This is
clearly no place for dilettantes.

Because of the foregoing, management tends to be saturated with a vast
number of problems of day-to-day control of available resources and in reacting
to opportunity or crisis situations. This saturation with immediate problems is
not conducive to a research orientation. To put this down to disinterest in 
research would be an oversimplification, although this disinterest often does 
exist. It is rather a case of not having time to wait for research people to learn 
the basics of railroading. Until recent times most research has been of a tech- 
nical or product-improvement variety and some excellent work has been carried 
out in these areas. Interest in systems research has developed more slowly.

The industry is an old one and the procedures and practices in use have 
evolved, mainly by trial and error methods, over a long period of time. They 
may not be optimum but experience has shown that they work. Moreover, the 
people working in the system have for the most part grown up in it and, like 



OPERATIONS RESEARCH IN NORTH AMERICAN RAILROADING 


329 


Professor Higgins, have "grown accustomed to her face." They are proud and
dedicated men motivated by a strong desire to do the best possible job for their 
companies, but their experience, education, and tradition militate against 
easy acceptance or adjustment to innovations, particularly if the new proposals 
come from outside the “family." 


3. HISTORICAL RÉSUMÉ

The first railroad in North America to become interested in the potentialities 
of operations research seems to have been the Chesapeake & Ohio. As a result of 
discussions between the OR group at Case Institute of Technology and the C & O 
in July 1951, a study was initiated to apply sampling methods to the settlement 
of interrailroad accounts [2]. The study was a technical success in that it clearly 
demonstrated the economic advantages of a properly designed sampling system. 
The implementation record, however, is less satisfactory. A few railroads
are now using sampling for interline accounting, but the methods are still far
from general adoption by the industry, despite the fact that sampling has in the
interval become standard practice for settling interline airline accounts. A
later study by Price Waterhouse on car-accounting methods, which came to
similar conclusions, has been no more successful. Some of the reasons for the
slow railroad acceptance are reviewed by Churchman [3].

Some pioneering work was also carried out in the early 1950s by Melpar (a
subsidiary of Westinghouse Air Brake), one of the railroad supply companies.
The Melpar team, headed by Glen Camp and Roger Crane, was subsequently
asked by the Railway Systems and Procedures Association (later Railway
Systems and Management Association) to put on a presentation on operations
research to its railroad members. This was done in late 1953 and a follow-up
seminar was held in February 1954 [4, 5]. Its participants read like a Who's
Who in Operations Research in North America: Ackoff, Camp, Churchman,
Crane, Johnson, McCloskey et al.

The 1954 seminar had a mixed reception. Several railroaders were "quite
outspoken in their belief that OR could be of real assistance." Others were
more cautious regarding their management's reaction because "management
wants a quick pay-off on any expenditure" and because of some "doubt as to
whether OR had enough experience to enable it to carry over into railroading
some of the methods and results it has used so successfully in other industries."
It was agreed that the OR people at the conference and the officers of the RSMA 
should get together to establish a program for operations research in the rail- 
road industry. Nothing seems to have resulted from this proposal. 

Approaches to the industry have been made since then by management- 
consulting firms and academic groups and some support has been obtained. 
On the whole, however, no great enthusiasm has been generated, and there has 
been very little follow-up by the railroads. At the present time only a few
railroads in North America employ people designated as operations researchers
and only one has a sizable and well-established group.





Names, however, can be misleading. A considerable amount of research in 
the OR tradition has been and is being carried out in many railroads under the 
banners of industrial engineering, systems analysis, transportation engineering 
and similar terms. If OR is what an OR practitioner does and the Canadian 
National group is doing OR, then most railroads are in fact doing OR. 

It would be surprising if this were not so. A railroad is a large and complex
business organization with all the problems of large and complex businesses.
Many of these problems lie in "classical" operational research areas; for
example, planning, allocating and scheduling of resources, replacement versus
repair decisions, optimum levels of maintenance, establishment of inventory 
levels, capital investment policies, and development of management information 
and control systems. The availability of operations research techniques and 
their potential value is not unknown to the people with responsibility for making 
decisions in these areas. The use of PERT and CPM, for example, is common- 
place in many railroads. (Canadian National is probably the greatest user of 
CPM in Canada, and in at least one road in the United States the use of PERT 
is mandatory for projects above a certain value.) 

The picture therefore is not so black as the industry’s apparent nonuse of 
operations research professionals would indicate. Although it can be argued 
that most of the work carried out would have benefited by the direct involve- 
ment of more professionals, much of it is nonetheless sound and workmanlike. 
The models may be simple, but underlying them is often some excellent systems 
analysis backed up by field work and in many cases direct experimentation. 
Unfortunately, little of this work is published. 

In this review we deal mainly with published work that can be considered 
specific or unique to railroading. Moreover, we concentrate on three areas that 
are of particular importance in railroad operations. 

1. Classification yards and terminals. 

2. Line-haul operations. 

3. Equipment allocation and distribution. This merits special discussion 
because of the large amount of capital involved in equipment and the low 
productivity obtained from much of that equipment. 

4. OPERATIONS RESEARCH STUDIES ON 
YARDS AND TERMINALS 

Railroad operations are a mixture of high speed and delays, of regularity and 
inconsistency. The delays and the causes of the inconsistencies take place 
mainly in classification yards and terminal areas. The trend to longer trains is, 
if anything, aggravating the situation. On the surface this trend looks like a 
classic example of the dangers of suboptimization or at least of the danger of 
not identifying all of the components of the trade-off. By optimizing line 
operations to minimize crew costs and obtain full benefit from the economies 
possible with diesel engines, cars are delayed longer, terminals become congested,
costs go up, service goes down, and customers become unhappy. The optimum 
length of a train is an unsolved problem waiting for an OR solution. 





detail the performance of a modern automatic classification yard [10, 11]. The
study showed that the models used to control cars in these somewhat expensive
installations were inadequate in many respects, and this together with the 
difficulty of allowing for wide variances in car reliabilities resulted in per- 
formance far from optimum. However, by using a combination of statistical 
analysis, developing a simple model of the yard operations, and making some 
inexpensive technical modifications, it was possible to reduce the effect of the 
variances in car reliability and match train and track assignments to traffic in 
an optimum manner. Substantial improvements were made in the existing 
system. Work is now proceeding toward optimization of the total hump system. 

Industrial switching is an important part of most terminal operations and
absorbs about half the total terminal costs. In these operations the economies of
mass handling of cars disappear, for cars are handled individually or in small
groups from and to individual shippers' sidings. Some profitable research has
been carried out in this area [12, 13] by many different railroads.

Because of the diffuse character of the operations, it is difficult to build 
a satisfactory mathematical model. A few railroads are now trying to get over 
this hurdle by means of computer simulations. The main developments so far 
have been improvements in control technology, rationalization of the organi- 
zation and command structures, and development of more appropriate infor- 
mation and communication systems. From these it has been possible to apply 
simple models to control the movements of cars and engines within the terminal 
area and to replenish shippers’ car inventories automatically with worthwhile 
gains in the productivity of cars and engines. In addition, the system is pro- 
viding information hitherto unavailable to establish meaningful indices to
monitor performance and service and for some much needed operations re-
search in a very gray area: the development of service criteria for movements
from shipper's siding to consignee's siding. Siding-to-siding times are still a
mystery to most railroads, yet these are the times that are most meaningful to
shippers. The need to keep them within acceptable tolerances is essential in
the competitive environment in which railroads are operating, and some excel-
lent systems work in this area has recently been carried out by the New York
Central and the Frisco. Other railroads are without doubt active in the same area.

5. OPERATIONS RESEARCH STUDIES ON
LINE-HAUL OPERATIONS

Compared with the apparent confusion and complexity of terminals and classi- 
fication yards, over-the-road operations appear orderly and controllable. The 
main problems arise in planning train services, controlling train movements, 
and determining the facilities required to provide the desired service. 

Similar problems have been tackled by operations research people elsewhere 
and the work is well documented in the literature. Unfortunately, attempts to 
apply the standard mathematical approaches to these problems in railroading 
have not been too successful. The physical constraints of railroad reality have 
usually been too tough for the theory to handle. The one ray of light has come 
from computer simulations. 



OPERATIONS RESEARCH IN NORTH AMERICAN RAILROADING


Simulation has been described as the “brute-force” alternative to mathe- 
matical model building. It is more than this in railroading. It is often the only 
way of developing a realistic and, at the same time, tractable model. It has 
another big advantage over mathematical formulations: it is more readily
understood and accepted by railroad executives. 

The Canadian National group has been particularly active in simulation 
development. For the most part the models have been completely deterministic 
and an almost direct translation of real operations. Two are in almost daily 
use [14, 15]. One simulates the running of a train on any section of track and 
is a basis for train scheduling, the provision of diesel units, and the determi- 
nation of power requirements. Another simulates the running of multiple trains 
on a single track under centralized traffic control. This is a fairly sophisticated 
feedback signaling system that relays track occupancy data to a central control 
point. It was developed to assist in the evaluation of a large capital investment 
program in new signaling facilities and planned improvements in track and
siding arrangements. Both simulations have since been applied to a wide range 
of conditions other than those for which they were developed and are bringing 
a new look into the study of many railroad problems. One senior executive is on 
record as stating that computer simulations have “replaced trial runs and
judgment in train operation planning,” and although this may be going too far
it at least indicates acceptance of the procedures by railroad management. 

Several railroads in the United States are now using the Canadian National
models and others have developed models of their own. A model of a railroad 
network, for example, has been developed by Allman [16] working initially 
under sponsorship of the National Bureau of Standards. The Frisco and 
several other railroads are extremely interested in this model and are now 
actively carrying out further developments and extensions for their own purposes. These developments could lead to some cooperative research by the roads
concerned. Although the scope of all models developed so far is fairly restricted, 
there is no doubt that interest in the potentialities of simulation is increasing at 
a rapid rate in many railroads. 

One of the more interesting development programs in North American railroading was begun in 1962 when seven railroads combined to sponsor a $1
million simulation research project at Battelle Memorial Institute. This kind of
cooperation for research is unprecedented in the industry. The original Battelle
objective was to simulate those parts of a railroad system concerned with the 
physical operations and information flow necessary to the movement of freight. 
The intention was to build an integrated model of freight transport operations 
that would include terminal operations, over-the-road movements, and inter- 
change and industrial switching, and would allow the interactions between them 
to be studied quantitatively. 

These are extremely ambitious terms of reference. It soon became apparent
that there are no short-cuts to simulating a railroad and that a considerable
amount of groundwork was essential before a model as comprehensive as this
one could be realized. After a year's experience a wiser group established more
modest objectives, the research was reoriented, and a few models of much more



PETER B. WILSON


limited scope and application were eventually produced [17, 18, 19]. Only the
surface of simulation potentialities has been scratched, and we are still a long
way from simulating a complete railroad system, yet this is one area in which the
future looks bright.


6. OPERATIONS-RESEARCH STUDIES IN
EQUIPMENT ALLOCATION AND DISTRIBUTION


One of the most important problems in all railroading and one that has
attracted considerable theoretical attention is the balancing of freight-car
inventories. Involved are the provision, allocation, and distribution of equipment
to meet shippers’ requirements. This problem has an important bearing on both
railway operating and capital expenditures. The amount of capital is enormous
and the utilization of equipment leaves much to be desired.

The problem is one of imbalances which appear between the number and
type of car available for loading and the number and type required. When
the inventory of empty cars in a loading area is below an acceptable level, empty
cars have to be moved at railway cost to adjust the imbalance. In the case of
box cars upgrading may be an alternative to moving, or the business may be lost.
When there is an oversupply, expensive equipment lies idle and yards and sidings
become congested, thus causing more delays. The tendency is, as always,
toward overprotection and the maintenance of a buffer inventory higher than
needed. This again leads to unnecessary use of capital.

This is one of those problems easy to formulate but difficult to do anything 
about. One difficulty has been the lack of timely information on car supply and 
demand at the different loading areas, but it is now in the process of being 
corrected by most railroads. Another, the inadequacy of methods of predicting
car supply and demand, has resulted in late response, overcompensation for
shortages, and wide swings in the car inventories. Students of industrial dynamics will be familiar with the pattern.
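The pattern alluded to here can be made concrete with a toy feedback loop: when empty cars ordered today arrive only after a multi-day lag, a naive correction rule overshoots and the inventory swings. The sketch below is purely illustrative; the lag, target, and demand figures are invented and are not drawn from any railroad data.

```python
# Toy illustration (not from the paper) of delayed-feedback car ordering.
# A naive rule that ignores cars already "in the pipeline" produces the
# late response, overcompensation, and wide swings described in the text.

def simulate(periods=12, lag=3, target=100):
    """Track an empty-car inventory under a naive ordering rule."""
    inventory, pipeline = 60, [0] * lag   # cars ordered but not yet arrived
    history = []
    for _ in range(periods):
        inventory += pipeline.pop(0)           # delayed arrivals
        order = max(0, target - inventory)     # naive rule ignores pipeline
        pipeline.append(order)
        inventory -= 20                        # steady loading demand
        history.append(inventory)
    return history

# Starting at 60 cars, the inventory dips to 0, overshoots to 160,
# and only then decays -- a classic industrial-dynamics oscillation.
print(simulate())
```

Lengthening the lag or raising the correction gain widens the swings, which is why the text stresses timely information and better predictive methods.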

Various aspects of the problem have been tackled at different times, and a
wide variety of techniques, which include allocation and inventory models,
linear programming, and queuing theory, has been used. Beckman, McGuire,
and Winsten [20] describe an approach to the strategic problem of moving cars
from one region to another in North America to balance seasonal fluctuations 


in demand. Feeney [21] in 1955 developed decision rules to control the empty-car inventory of the Southern Pacific Railway and to direct the interdivision
moves required within the railroad to minimize car inventory costs. More
recently, the Canadian National group studied the cause of maldistribution and
identified the essential elements in organization, information, and procedures
required for effective control. The findings have been successfully tested in a
series of experiments, most of which have been profitable in their own right
[22, 23, 24, 25]. The Louisville and Nashville Railroad [26] has also recently
applied, with considerable success, a linear programming approach to the
control of some specialized wood-rack cars. The problem is still far from solved,
but much of the debris has been cleared away.





Some of the models merit a better fate than they have received. The Feeney
study, for example, was a pioneering and imaginative approach to the problem 
but ran into difficulties because of the inability of the existing information and 
communication system in the sponsoring railroad to produce more than partial 
information within the time requirements of the model. Most railroads are now 
building improved information and communication systems which will assist in 
providing the timely data required for control purposes. There is still a need, 
however, for better forecasts and predictive methods, for without them a system 
with built-in reaction times as long as those experienced in car supply will 
inevitably tend toward instability. 

Very little operations-research work has been carried out on the assignment
and allocation of diesel units and crews. The numbers here are smaller and
the control problem more tractable. Most railroads have, by trial and error 
methods, evolved procedures that result in acceptable utilization of the diesel 
fleet. 

A group from Purdue University and the Frisco Railroad applied linear- 
programming techniques to determine the minimum number of engines required 
to maintain a fixed traffic schedule and to minimize maintenance costs [27]. 
The method, with minor modifications and refinements, was used by the 
sponsoring railroad for some time but eventually fell into disuse. A similar 
approach was used to assign the mainline crews of the Union Pacific Rail- 
road [28]. This relatively small-scale problem included most of the difficulties 
found in larger assignments. The model produced feasible optimum assignments 
that allowed the number of crews to be reduced significantly. This work by the 
Purdue group seems to be the only operations research carried out in this field. 
Many problems remain unanswered, and once again the stakes are high. One 
diesel unit pays for a lot of research. 


7. A GLIMPSE OF THE FUTURE 

Possibly the greatest transformation that has taken place in North American
railroading in the last decade is the emphasis placed on competitive customer-oriented service. The result has been described as a “quiet revolution” [29]
in all areas of railroading.

The successful railroads are not just keeping up with change, they are
causing it. This has manifested itself in a host of innovations: equipment
tailored to customers’ requirements, competitive pricing and incentive rates that
are almost revolutionary in scope and concept, unit and integral trains that
benefit both railroad and shippers, containerization and piggyback services,
quality control of services, and an increased emphasis on marketing; but possibly
most important for the future is the change in attitude that has taken place.

The railroads are now taking a broader and more fundamental look at their
own operations and activities and at their places in the over-all transportation
picture. Attention of management is being focused increasingly on such problems 
as rationalizing the existing railroad system to determine what should be re- 
tained, what should be discarded, and what changes should take place in the 




close to rationalizing the transportation picture, much more effort in operations 
research is clearly required. 

There are some lessons for operations research in our review. Some of our
sacred cows may have to be profaned; for example, are sophisticated mathematical models as powerful in dealing with a complex system as we fondly
advocate? Experience in railroading indicates that even when problems can be
formulated mathematically rarely can they be solved without simplifying the
real world to an unacceptable degree. It is perhaps of interest that most successfully implemented operations research has had a basis of experimentation or its
modern counterpart, simulation.

Gone also is our fond hope that the bright apostles of the new faith can
meander into a complex industry and with a minimum amount of knowledge of
the industry quickly demonstrate how bright they are and how silly everyone
else is. Experience in railroading demonstrates the opposite. Many attempts
have not resulted in implementation merely because the people in operations
research were too remote from the operations to understand or appreciate some
of the essential constraints of the system.

A good beginning has been made, however, and the apprenticeship in
operations research can be considered over. There is in the industry at the
present time a greater awareness of the very real problems facing it and an
increasing interest in the potentialities of operations research to help solve these
problems. Its prospects and conditions are brighter than at any previous time
and a new era could be upon us. The only question is: can it carry the ball?


9. REFERENCES 

[1] "The transportation system of our nation,” President John F. Kennedy's message 
to the United States Congress, April 1962. 

[2] W. R. Van Voorhis, “Statistical Sampling in Accounting,” J. ORSA, 1, 259
(1953).

[3] C. W. Churchman, “Sampling and Persuasion,” Operations Res., 8, No. 2, 254
(1960).

[4] Proc. Railway Systems and Procedures Assoc. (now RSMA), 1953 Winter Meeting.

[5] Proc. RSMA Seminar on Operations Research, February 1954.

[6] R. Crane, R. B. Brown, and R. O. Blanchard, “An Analysis of a Railroad Classification Yard,” Operations Res., 3, No. 3 (August 1955).

[7] E. Cook, “Prediction and Analysis of Classification Yard Performance,” Proc.
RSMA Seminar, Fall Meeting, 1960.

[8] E. Cook, “Operations Research as a Basis for Improved Design, Operation and
Monitoring of Freight Railway Systems,” UIC International Symposium on
Cybernetics, November 1963.

[9] E. Mansfield and H. Wein, “A Model for the Location of a Railroad Classification
Yard,” Management Science, 4, No. 3 (April 1958).

[10] C. E. Law, D. C. Lieu, and C. V. Jacquemain, “Optimizing Performance in an
Automatic Freight Car Classification Yard,” CORS J., 2, No. 2 (December
1964).

[11] C. E. Law and D. C. Lieu, “Measuring and Adjusting Performance in an Automatic
Hump Yard,” Proc. RSMA Seminar, “Quality Control and Reliability Engineering,” November 1964.

[12] J. W. Burrows, “Developments in Terminal Operation Control Technology”
and “Technological Change and the Railways,” Northwestern University, 1961.




[13] R. M. Whitcombe, “The Integration of Terminal Car Distribution and Terminal
Car Control” (December 1963) and “Terminal Car Control in CN” (September
1965), internal CN publications.

[14] C. J. Hudson, “A Computer Simulation of CTC Railroad Operations,”
62-RR-5, April 1962.

[15] P. B. Wilson and C. J. Hudson, “Simulation of Over-the-Road Operations,”
UIC International Symposium on Cybernetics, November 1963.

[16] W. P. Allman, “A Network Model of Freight System Operations,” Proc. RSMA
Seminar, March 1966.

[17] F. A. Koomanoff and A. A. B. Pritsker, “Railroading as a Systems Concept,”
Battelle Tech. Rev. (March 1962).

[18] F. A. Koomanoff, “The Development of Management Decision-Aiding Laboratories,” UIC International Symposium on Cybernetics, November 1963.

[19] “Simulation of Railroad Operations,” Proc. RSMA Seminar, March 1966.

[20] M. Beckman, C. B. McGuire, and C. B. Winsten, “Studies in the Economics of
Transportation,” RM-1488, RAND Corporation, May 1955.

[21] G. J. Feeney, “The Empty Box Car Distribution Problem,” Proc. 1st IFORS
Conf., 1957.

[22] “Freight Car Distribution,” Proc. RSMA Seminar, July 1962.

[23] J. W. Johnson, C. J. Hudson, and G. A. Cooper, “An Experiment in Operational
Planning,” E.I.C. Engrg. J., 46, No. 4 (April 1963).

[24] J. W. Johnson and E. M. Kovitch, “Freight Car Distribution: the Nature of the
Problem,” CORS J., 1, No. 1 (December 1963).

[25] A. D. Dingle, “Application of Queuing Theory to Freight Car Allocation,”
presented at CORS annual meeting, May 1959.

[26] C. D. Leddon and E. Wrathall, “Empty Freight Car Scheduling,” to be published.

[27] V. B. Gleaves, T. E. Bartlett, and A. Charnes, “Cyclic Scheduling and Combinatorial Topology,” Naval Res. Logistics Quart., 4, No. 3 (September 1957).

[28] H. R. Soyer, A. Charnes, and M. H. Miller, “Mathematical Programming and
Evaluation of Freight Car Shipment Systems,” Naval Res. Logistics Quart., 4,
No. 3 (September 1957).

[29] Business Week, No. 1894, 122-124, 126 (December 18, 1965).


LA RECHERCHE OPERATIONNELLE DANS LES
COMPAGNIES FERROVIAIRES NORD-AMERICAINES

RESUME 

La participation de la recherche opérationnelle à la solution des problèmes
rencontrés dans les compagnies ferroviaires nord-américaines a été étudiée et
évaluée. Un bref relevé historique des réalisations obtenues durant la période
après la guerre a fait ressortir le peu d'enthousiasme démontré par l'industrie
envers les techniques conventionnelles de la recherche opérationnelle. L'on
note que de nombreux travaux de recherches se situant exclusivement dans le
contexte du secteur ferroviaire ont été effectués, sans qu'on se soit explicitement référé à la recherche opérationnelle. La plupart de ces travaux portent
sur les opérations effectuées dans les cours de triage, terminus ainsi que sur la
répartition et l'affectation du matériel roulant. Un bref exposé sur quelques-uns des problèmes pertinents indique que la complexité d'un réseau ferroviaire





moderne milite contre l'application efficace de techniques mathématiques
élaborées. Toutefois, la mise en application de techniques relativement simples
s'est avérée particulièrement efficace parce que celles-ci reflétaient davantage les
réalités. L'expérimentation et la simulation ont donné lieu à des résultats
prometteurs, ces derniers ayant été acceptés avec empressement par la direction
des compagnies ferroviaires. Le compte-rendu se termine par l'énoncé de
quelques récents développements qui seraient de nature à susciter l'intérêt des
praticiens de la recherche opérationnelle qui évoluent dans le secteur ferroviaire.


A COMPUTER SIMULATION MODEL OF RAILROAD 
FREIGHT TRANSPORTATION SYSTEMS 

Une Simulation sur Computer des Réseaux
Routiers de Transport de Marchandises


William P. Allman 

The National Bureau of Standards and Northwestern University 
United States of America 


1. INTRODUCTION 

A railroad on which freight cars are moved efficiently benefits from reduced
freight-car transit times, greater car availability, and lower per diem charges.†
Therefore methods for evaluating freight-car movement under alternative
scheduling and associated policies are significant and are especially important
for planning new railroad systems that result from mergers and acquisitions.
This article describes a computer model that may be used to simulate some basic 
railroad freight scheduling and associated policies. 


2. BACKGROUND 
2.1. The Railroad Setting 

A major objective of a railroad enterprise is to accommodate demands for the 
movement of freight cars between points on the line. Demands arc of two 
types: regular demands which originate at known points in time with some 
degree of assurance, and irregular demands which are unexpected. A demand 

† Broadly speaking, per diem charges are costs incurred by a railroad for the use
of freight cars owned by other railroads. Such charges are based on the number of
days a car is on the railroad and its depreciated value.





originates when a freight car requires movement from a point called its demand
origin. The car remains a demand until it reaches its demand destination. Demand origins and destinations may be points of loading and unloading on the
railroad or interchange points where the railroad connects with other railroads.
Demands include both empty and loaded freight cars, for empty cars, which
require movement just as loaded cars do, typically represent a significant percentage of the car movement on a railroad.†

A railroad may contain several thousand demand origins and destinations
and numerous switching yards in which cars can be switched from one train to
another. Local trains move cars between demand points and switching yards,
but major car movements are represented by “over-the-road” trains that
move cars between major yards. For purposes of planning over-the-road car
movements a railroad may be described as a network of nodes (major yards)‡
and links (railroad lines), and demand origins and destinations may be considered as the nodes in the network.

In railroad terminology a train “picks up” or “takes” a car, “hauls” a
car over trackage, and “sets off” (drops) a car. Motive-power tonnage capacity
and link topography limit the number of cars a train may haul over a link.
For many origin-destination pairs no single train travels from the origin to the 
destination. Thus, in completing its transit, a car may have to travel on several
different trains, set off by one train and taken by another at intermediate yards.

Cars undergo time-consuming servicing and inspection operations in the
yards and are sorted (classified) into categories called “groups” which are
commonly identified by (a) traffic class and (b) that future yard to which cars
in the group are to be hauled before being reclassified. Trains departing from
yards take cars from groups assigned to them. Sorting at yard i is done according
to a set of rules called the yard grouping policy, which may be represented by
the matrix G^i = ||g^i_kj||, where g^i_kj is that group into which “cuts” of traffic
class k and destined to yard j are sorted. Such policies are normally time-invariant (i.e., the matrix is the same for all points in time), although the sorting
of cars into groups at a given time could conceivably depend on how soon
thereafter trains could take cars from the various groups.
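A time-invariant grouping policy is essentially a lookup table indexed by traffic class and destination. The following is a minimal sketch of that idea, not the model's actual code; every yard, class, and group name is invented for illustration.

```python
# Illustrative sketch of one yard's grouping policy G^i: the pair
# (traffic class k, destination yard j) maps to the group g^i_kj into
# which arriving cuts are sorted. All names are invented.

def make_grouping_policy():
    """Return a yard grouping-policy 'matrix' as a dict."""
    return {
        # (traffic_class, destination_yard): group label
        ("merchandise", "yard_B"): "group_east",
        ("merchandise", "yard_C"): "group_east",   # blocked together
        ("perishable",  "yard_B"): "group_fast",
        ("merchandise", "yard_D"): "group_west",
    }

def classify(policy, traffic_class, destination):
    """Sort a cut into a group under the yard's grouping policy."""
    return policy[(traffic_class, destination)]

policy = make_grouping_policy()
print(classify(policy, "merchandise", "yard_C"))  # -> group_east
```

Because the table is the same at all points in time, the dict needs no time index; a time-varying policy would add one.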

Because of capacity limitations, a train may not take all cars in a group 
assigned to it; however, all cars taken from a group may be considered as that 
group aboard the train. Thus it may be said that trains haul groups rather than 
individual cars. A group is set off from a train at some future yard at which (a) 
the group is “ broken ” and its member cars are resorted or (b) the group departs 
intact on another train. This latter situation is called “ pregrouping ” and is done 
to relieve congestion at critical yards and facilitate making tight train connections. 
In actual practice pregrouping is the exception rather than the rule; however, its 
occurrence has important effects on yard operations and freight-car transit 
times. 

† Decisions governing the allocation and distribution of empty freight cars are
beyond the scope of our consideration. From our viewpoint such decisions determine
when and where an empty-car movement demand will originate.

‡ The terms nodes and yards are used interchangeably.





2.2. Some Basic Questions 

In this article we are concerned with over-the-road freight-car movements
on over-the-road trains that travel between major yards. To accommodate
movement demands, basic operating policies are established by a railroad with
respect to train routes, train capacities, and train schedules and the assignment
of groups to trains for hauling. Although unscheduled trains are often needed
to accommodate voluminous irregular demands, regularly scheduled trains
must exist in order that a railroad may plan and allocate resources efficiently
and attempt to satisfy customer desires for on-time and reliable deliveries. The
establishment of preplanned train schedules includes answering the following
basic scheduling and sorting questions of railroad freight operations:

1. When and where should regularly scheduled trains run?

How many trains should the railroad run over each link?
What should the routes of individual trains be?

At what times should trains be scheduled?

What should the hauling capacities of individual trains be?

2. For each yard, what should the grouping policy be?

What sorting classifications (groups) should exist at each yard and by
what rules should cars of various destinations and traffic classes be
classified into them?

3. For each link of a train’s route, what cars should be assigned to the
train for hauling?

At each yard of a train’s route, from what groups should the train be
assigned to take cars?

At what future yard should the cars taken from a given group be set
off from the train?

The difficulty of answering these interdependent questions is related to the
degree of connectivity and size of the railroad network. Answers must be
obtained in accordance with operational objectives of the railroad enterprise.†
For a railroad there exists no single measure by which to evaluate the numerous
alternative sorting and scheduling policies that may be employed against likely
patterns of traffic demands. Measures of evaluation, however, may be expressed
in terms of key operating performance measures such as freight-car transit
times, delays due to congestion at yards, number of trains, train lengths, yard
volumes, total car-days, and operating costs. The purpose of this article is to
describe how digital-computer simulation may offer a fruitful way of experimenting with potential scheduling and sorting policies and investigating the
railroad operating performance that may be expected to result from employing
selected policies against specified demand traffic patterns.

† The subject of railroad operations performance is one of great sensitivity, and
objectives differ according to the ownership form or geographical characteristics
and configuration of the individual railroads.




2.3. Simulation of Railroad Operations 

Only a few railroad simulation studies have been undertaken. Most of them 
focus attention on a particular subarea of railroad operations rather than on a 
total network. Examples include the manual simulation of classification yard 
processes by Crane, Brown, and Blanchard [4], the simulation of centralized 
traffic control (CTC) installations by the Operational Research Branch of the 
Canadian National Railways [7], and the four computer models developed by 
the Railroad Systems Research Group of the Battelle Memorial Institute† [9].

Relevant nonsimulation network-oriented studies of railroad operations 
include a mathematical programming approach by Charnes and Miller [3], 
Feeney’s investigation of the distribution of empty freight cars between divisions 
of a railroad [5], the comprehensive exposition on railroad operations and treat- 
ment of specialized problems by Beckman, McGuire, and Winsten [1], and 
Boldyreff’s flooding technique for estimating the maximal steady-state flow of
traffic through a railroad network [2]. These works illustrate numerous com- 
plexities inherent in railroad operations but do not consider time dependencies 
or the basic scheduling and sorting policies from a total-network viewpoint. 

It appears that at a total-network level a railroad is too complex to be 
modeled analytically. Experience is needed to determine how well total-network 
railroad operations may be simulated by using digital computers and also the 
degree to which newly developed computer simulation languages such as GPSS 
[6] and SIMSCRIPT [8] may aid in the process. Construction of a railroad 
network model was first attempted (unsuccessfully) with GPSS. The model 
now described has been successfully constructed in SIMSCRIPT. 

3. THE RAILROAD NETWORK MODEL 
3.1. Purpose and Scope 

The purpose of the model is to serve as a tool with which some basic oper- 
ating policies of freight operations may be investigated at a total-network level. 
The model is designed to permit comparisons of alternative policies for specific 
railroad systems. In addition to the basic scheduling and sorting policies de- 
scribed earlier, other policy questions that may be investigated are the following: 

1. Running a few long trains versus many short trains. 

2. Concentrating classification activities at selected yards rather than 
spreading them over the entire network. 

3. Having selected yards switch only during specific work shifts. 

4. Having a large or small amount of pregrouping. 

The model simulates n days of operation of an N-node network. Major
inputs are train routes and schedules, yard grouping policies, train group

† The Battelle models separately deal with: (a) motive-power assignment and utilization, (b) single-track over-the-road train movement, (c) classification yard functions,
and (d) diesel locomotive servicing functions. The first model is the only one that considers a spatial railroad network; the others consider individual functional activities of a
railroad.




assignments, and freight-car movement demands. Several concepts important 
to railroad operations planning are represented only through inputs: 

1. The model does not consider possible trackage restrictions on train
movement such as single-track links or siding lengths; all specified schedules
are assumed to be feasible with respect to such restrictions.

2. Road engines are not directly represented in the model; however, a
train-length capacity may be defined for each link of a train’s route to reflect
train-length limitations imposed by presupposed motive power.

3. Although it is recognized that demands on a railroad for car movement
are a function of provided schedules and services, demand inputs to the model
are assumed to be fixed; that is, the model does not adjust demands to reflect
changes in schedules and associated policies.

At any time a large railroad may have thousands of freight cars of different
traffic classes on its tracks. To investigate railroad traffic flows car movement
may be examined in terms of sets of cars that travel together rather than in terms
of individual cars. In the model cars that travel together are aggregated into
“cuts,” that is, sets of cars that originate as demands at the same yard at the
same time and have the same destination. Freight-car traffic classes are not
recognized. A cut is the basic unit of freight-car flow in the model and is not
divisible into those cars it represents.

The model has two forms. In its “extended” form processing rates of cars
through yards are a function of the availability of physical and personnel resources (e.g., operating facilities, switch engines, and work forces) essential
to the accomplishment of yard operations. In its “basic” form it is assumed
that such resources are unlimited.

3.2. The Flow of Cars and Trains

Demands on the railroad may originate at any time at any yard. A demand
consists of a cut of x_ijt cars originating at yard i at time t and requiring movement
to yard j. Demands may be completely prespecified (deterministic) or generated
probabilistically. For probabilistic demands, the probability distribution F_it
yields a total quantity of cars originating at yard i at time t. Each originating
car is assigned a destination from the probability distribution G_it.† All cars
destined to the same destination are aggregated into a single cut. The basic
cycle of a cut is to

(a) be created as a demand on the railroad at its origin yard;

(b) be processed through inbound operations at the yard;

(c) terminate if the yard is the cut’s destination; otherwise the cut is
classified into a group at the yard;

(d) be reserved to be picked up by a train that takes cuts from the group;

(e) be processed through outbound operations at the yard;





(f) be picked up by and depart on the taking train; 

(g) remain aboard the train until that future yard at which its group is 
set off; 

(h) be set off from the train. Go to Step (b) and continue the cycle.

Steps (b) through (e) do not occur for cuts pregrouped through a yard. In the real
world pregrouped cuts are expedited through yards in order to make connections
to outbound trains. To represent this in the model pregrouped cuts undergo a
single “expediting” operation, after which they are eligible for connection.
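The yard portion of the cut cycle, with the pregrouping shortcut, can be sketched as a small step table. This is an illustrative reconstruction, not the model's code; the step and function names are invented.

```python
# Sketch of what happens to a cut at one yard: a full inbound-to-outbound
# pass, unless the yard is the destination (terminate) or the cut is
# pregrouped (a single "expediting" operation replaces the full pass).

FULL_CYCLE = [
    "inbound_operations",   # step (b)
    "classify_into_group",  # step (c), yard is not the destination
    "reserve_for_train",    # step (d)
    "outbound_operations",  # step (e)
]

def yard_steps(is_destination, pregrouped):
    """Return the processing steps a cut undergoes at one yard."""
    if is_destination:
        return ["terminate"]
    if pregrouped:
        return ["expedite"]   # pregrouped cuts skip the full pass
    return FULL_CYCLE

print(yard_steps(is_destination=False, pregrouped=True))  # -> ['expedite']
```

The shortcut is exactly why pregrouping matters for yard workload and transit times: an expedited cut consumes one operation instead of four.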

A train travels through the network according to its route and schedule
(which is met unless the train is delayed waiting to pick up cuts at a yard).
Cuts are picked up and set off in accordance with the train’s Take List, which
specifies, for each yard of the train’s route, the groups at the yard from which
the train is assigned to take cars.† Cuts taken (subject to train capacity) are
reserved for the train before its scheduled departure from the yard (at its “cut-off” time) so that outbound operations prerequisite to their pick-up by the train
may begin.‡ The basic cycle of a train is to

(a) reserve cuts to be taken at its first yard; 

(b) pick up the reserved cuts after outbound yard operations have been 
performed on them; 

(c) pick up pregrouped cuts that are to connect to the train; 

(d) depart and travel over the link to its next yard; 

(e) reserve cuts to be taken at its next yard so that outbound yard operations 

to be performed on them may begin; 

(f) arrive at its next yard and set off appropriate cuts; 

(g) terminate if the yard is its destination; otherwise go to step (b) and 
continue the cycle. 
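A compressed sketch of this train cycle follows. The route, Take List, and group structures are invented stand-ins for the model's inputs, and the reservation/cutoff timing of steps (a) and (e) is collapsed into the pick-up itself.

```python
def run_train(route, take_list, groups, capacity):
    """Visit each yard on the route, set off cuts bound for that yard,
    then take cuts from the Take List's assigned groups subject to a
    capacity in cars (as in the model, capacity is not tonnage).
    Returns one (yard, cuts_set_off, cars_aboard) record per yard."""
    aboard = []  # list of (cars_in_cut, set_off_yard)
    log = []
    for yard in route:
        set_off = [c for c in aboard if c[1] == yard]        # (f) set off
        aboard = [c for c in aboard if c[1] != yard]
        for group, set_off_yard in take_list.get(yard, []):  # take cuts
            for cut_size in groups.get((yard, group), []):
                if sum(c[0] for c in aboard) + cut_size <= capacity:
                    aboard.append((cut_size, set_off_yard))
        log.append((yard, len(set_off), sum(c[0] for c in aboard)))
    return log
```

As in the model, a train may leave cuts behind when its capacity is reached; the cuts it does take keep their group identity aboard the train via the recorded set-off yard.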

3.3. Yards and Yard Operations 

Since yards are so important to total-network performance (particularly 
to the times at which cars are eligible for movement), it is necessary to consider 
yard operations to some extent in any railroad network model. Although 
yards differ with respect to physical layout, resources, and required operations, 
there are basic similarities that permit the model to contain a standard yard 
structure applicable to every node of the network. Fixed parameters govern 
the number and sequence of operations performed at each yard and the time 
necessary to perform each operation, hence the availability of freight cars for 
movement out of yards on trains. 

† Each entry of a train’s Take List for a given yard contains (a) a group from which 
the train is assigned to take cuts, (b) the future yard at which cuts taken from the group 
are to be set off from the train, and (c) a code indicating whether the cuts taken are to 
be pregrouped through the yard at which they are set off. 

‡ The model does not consider cut tonnage; train capacity is in terms of number of 
cars. As stated earlier, a train may not take all cuts waiting in a group assigned to it be- 
cause of capacity limitations; however, all cuts taken from a given group may be con- 
sidered to constitute that group aboard the train. 



COMPUTER SIMULATION MODEL OF RAILROAD FREIGHT SYSTEMS 

The standard yard structure of the model requires that an ordered set of 
operations be defined for each yard. What each defined operation represents 
is specified by the analyst; that is, operations may represent bleeding, inspec- 
tion, trimming, or makeup. One operation must be the classification operation. 
From a yard-operations viewpoint there are four types of car. Each type under- 
goes a different ordered subset of operations. Cars on which a sequence of 
operations is performed simultaneously are aggregated into sets called segments. 
The four types of car and their corresponding segments are the following: 

1. Cuts originating as demands at the yard within a specified time interval 
are aggregated into an input segment. 

2. Cuts set off from an inbound train, and which are to be classified at 
(i.e., not pregrouped through) the yard, are aggregated into a set-off segment. 

3. Cuts set off from an inbound train, and which are to be pregrouped 
through the yard, are aggregated into a pregrouped segment. 

4. Cuts that have been reserved (from groups) to be taken by a specific 
outbound train are aggregated into a reserved segment. 

The time required for a segment to undergo an operation differs by yard 
and operation as a function of segment size (the number of cars in the segment). 

In the extended form of the model any operation at any yard may require 
the use of a facility and/or a force. Different facilities and forces may be defined 
for each yard. A facility is defined as a “fixed” resource that may not move 
about the yard and which is used solely by the operation with which it is 
associated. A force is defined as a resource that may move about the yard and 
be used in different operations. What each type of facility and force represents 
is specified by the analyst; that is, facilities may represent a car servicing area 
and the yard hump, whereas forces may represent switching engines and car 
inspection teams. For each yard different levels of each type of facility and 
force are specified for each of three contiguous work shifts. Facilities and forces 
are assigned to yard operations on a first-need, first-served basis. An operation 
on a segment does not begin until a facility and/or force of the type(s) required 
by the operation are available to the operation. Segments delayed at operations 
because of unavailable resources wait in FIFO queues associated with the reason 
for the delay. 
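The first-need, first-served allocation with FIFO delay queues can be sketched as below. Shift-dependent resource levels and the facility/force distinction are omitted, so this is a simplified illustration rather than the model's mechanism.

```python
from collections import deque

class Resource:
    """A pool of identical facilities or forces with a FIFO wait queue."""
    def __init__(self, level):
        self.free = level        # units currently idle
        self.queue = deque()     # delayed segments, in arrival order

    def request(self, segment):
        """Seize one unit; return True if the operation may begin now,
        False if the segment must wait in the FIFO queue."""
        if self.free > 0:
            self.free -= 1
            return True
        self.queue.append(segment)
        return False

    def release(self):
        """Free one unit; the longest-waiting segment (if any) starts
        immediately and is returned, else the unit goes idle."""
        if self.queue:
            return self.queue.popleft()
        self.free += 1
        return None
```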

3.4. Operating Costs 

Costs are considered in the model as follows: 

1. In the real world costs of hauling cars over links actually depend on 
the motive power utilized, speed of travel, link distance and topography, ton- 
nage hauled, and crew costs. In the model, hauling costs are taken to be a 
stepwise nondecreasing function of train length. Coefficients of the function 
may differ for individual links. Movement costs are accumulated for each 
link and for the total network. 
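A stepwise nondecreasing hauling-cost function of the kind described can be sketched as follows; the breakpoints and per-step costs are invented coefficients for one hypothetical link, not figures from the model.

```python
import bisect

def hauling_cost(train_length, breakpoints, step_costs):
    """Stepwise nondecreasing hauling cost over one link.

    breakpoints: sorted upper bounds (in cars) closing each step;
    step_costs: one cost per step, with step_costs[-1] applying to any
    train longer than the last breakpoint.  The function is
    nondecreasing as long as step_costs is nondecreasing."""
    step = bisect.bisect_left(breakpoints, train_length)
    return step_costs[min(step, len(step_costs) - 1)]

# Hypothetical coefficients: trains of up to 50 cars, up to 100, longer.
breaks, costs = [50, 100], [200.0, 350.0, 500.0]
```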

2. In the real world yard operation costs depend on a multitude of factors. 
In the model, yard costs are a function of yard switching volumes plus the 







costs of forces employed at yards. (Because facilities are fixed resources, costs 
are not associated with them.) The cost of classifying a car at a yard is different 
from the cost of pregrouping a car through the yard. Switching costs are 
accumulated for each yard and for the total network. Costs of forces are also 
accumulated. 

Although these cost considerations require refinement and extension to be 
useful for true costing purposes, it is believed that they provide a basis for 
comparisons of total-network operating policies. 

3.5. Summary of Inputs and Outputs 

At a synoptic level a simulation model may be summarized by its inputs 
and outputs. Inputs to the railroad network model consist of the following: 

1. A network description which includes a network transition matrix, a 
distance matrix, and hauling-cost information. 

2. Train descriptions for each train which include the train’s schedule, 
route, days the train runs, travel time over each link, cutoff times, scheduled 
stopping time at each yard, permitted train length over each link, and Take 
Lists. 

3. Freight-car demand descriptions which include a list of input cuts 
x_ijt, and/or probability distributions F_it and G_it from which cuts are generated. 

4. Yard descriptions which include the number and types of operations 
at each yard through which each segment is processed and operation times as a 
function of segment size. Switching costs must also be provided. If the extended 
form of the model is used, the following information is needed for each yard: 
work shift starting times, facility and force levels during each shift, facilities 
and forces required by operations, and the lengths of time facilities and forces 
are used by operations. 

5. Control specifications that define control parameters for the individual 
simulation run as well as output options. 

Major outputs include “progress notices,” which may be printed whenever 
selected events occur, and a system summary. Notices of other important 
circumstances (e.g., when a train is delayed at a yard because the cuts it is to 
pick up have not completed outbound yard operations) may also be obtained. 
The system summary describes (at prescribed intervals) the total simulated 
network and displays values of various statistics since they were last reset. 
Figure 1 shows portions of the summary output for the basic form of the model. 
If the extended form is used, a facility and force utilization report provides, 
for each type of facility and force employed at a yard, resource utilization 
in terms of car and segment volumes, plus statistics relative to delays associated 
with the type of resource. 

3.6. Computer Considerations 

The structure (world view) of SIMSCRIPT is convenient for constructing 
network-flow models; space does not permit a discussion of useful relevant 





Figure 1. Portion of summary output. 

features of the language. The railroad network model consists of four SIM- 
SCRIPT programs, illustrated in Figure 2. The preanalysis program analyzes 
input data for logical inconsistencies. The cut-generation program prepares a 
tape time file of cut inputs from freight-car demand descriptions. The third 








Simulation Run Route, Schedule, 
Specifications and Policy Data 



Figure 2. Diagram of computer programs constituting railroad network model. 


program is the main simulation program, which may optionally output a 
history tape containing records of each cut and/or train movement. This tape 
may be used as input to a postanalysis program that generates statistics from 
train and cut histories. 

Computer storage for descriptions of trains, cuts, and segments and notices 
of future SIMSCRIPT events is allocated dynamically by SIMSCRIPT. 
Table 1 describes train, cut, and yard configurations for some executions of 
the basic form of the model by using the SHARE version of SIMSCRIPT 
on the IBM 7094 computer. The computer running times given do not include 
execution times for the preanalysis or cut-generation programs. 


Table 1 

Railroad Network Model Computer Running Times 

No. of   No. of   Cuts Input   Average No. of Yards   Average No. of Entries   Length of Simulation   Computer Running 
Yards    Trains   Daily        in a Train’s Route     in a Train’s Take List   Run (days)             Time (minutes) 

7        12       294                                 5.5                      — 
7        24                                           5.5                                             6.4 
85                                3.4                 6.4                                             4.4 

The major restriction limiting the size of a railroad system that can be modeled is 
computer memory required for (a) descriptions of cuts and (b) SIMSCRIPT event 
notices, each of which requires four 36-bit computer words. The effective representa- 
tion of large traffic volumes requires a greater storage capacity than that provided by the 
7094; hence it is noteworthy that computers with larger memories and SIMSCRIPT 
capabilities are becoming available. 















4. USING THE MODEL 

4.1. Tactical Planning 

Starting conditions of a simulation run may range from one extreme of 
“empty-and-idle” conditions to another in which the railroad is fully oper- 
ational and the simulation happens to start “now.” In the model the problem 
of starting conditions (i.e., overcoming the artificiality introduced by the abrupt 
start of the simulation) is handled in the traditional manner of excluding results 
of an initial portion of a simulation run from consideration. No general cri- 
terion exists for determining when measurement of the actual simulation should 
begin; the model includes time parameters that govern when 

(a) cuts first enter the system (and are not hauled); 

(b) trains first enter the system (and travel empty); 

(c) trains first haul cuts; 

(d) simulation measurement begins; 

(e) accumulated statistics are reset. 
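The exclusion of an initial portion of a run can be expressed simply, as below. The warm-up length is a free parameter, mirroring the absence of any general criterion for when measurement should begin.

```python
def mean_after_warmup(samples, warmup):
    """Mean of a per-period statistic with the first `warmup` periods
    excluded, the traditional treatment of artificial starting
    conditions.  Returns 0.0 if nothing survives the cut."""
    kept = samples[warmup:]
    return sum(kept) / len(kept) if kept else 0.0
```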

Since the variability associated with the outputs of even simple Monte 
Carlo simulation models is often discouragingly large, we must expect that 
outputs from the railroad network model will be quite diverse. 

4.2. An Application of the Model 

As of this writing, preparation is underway to apply the model to data 
representing a major U.S. railroad. The actual railroad is being represented 
as a 20-node network through which approximately 85 regularly scheduled 
trains run. Actual freight-car traffic demand data have been collected over a 
10-day period and transformed to represent movements between the 20 yards. 
For each of the 380 origin-destination combinations data were appropriately 
reduced to represent the number of cars originating as demands on the network 
during each hour of each day. For each yard input points in time and their 
associated input quantities were selected from an examination of these hourly 
tabulations. Destination probabilities for each input have been taken as the 
percentage of total cars bound for each destination. It is significant to note that 
development of these time-dependent origin-destination demand data (repre- 
senting more than 150,000 freight cars) is in itself a major data-processing task 
that could not have been reasonably accomplished without the existence of a 
computerized car-movement reporting system. 

For this first application of the railroad network model to real data complete 
validation of the model is not expected. The abstractions of the model from the 
real world are substantial, and it is anticipated that this current version will 
be more of a pilot model than an actual productive tool. 

4.3. Cost and Benefits 

Experience that permits an appraisal of the economic justifiability of the 
model in terms of its costs and benefits is lacking, but it is clear that large po- 
tential savings in railroad operations are possible (e.g., some railroads absorb 







large per diem deficits that may be reduced with modest improvements in 
freight-train scheduling to permit cars destined to other railroads to leave the 
railroad before rather than after midnight). The ability to investigate the im- 
plications of total network policies by experiment rather than by actual opera- 
tions can clearly contribute significantly toward realizing savings in operating 
costs. 


5. SUMMARY 


5.1. Limitation of the Model 

Although the model includes important aspects of a total railroad network 
it disregards many factors that are significant in railroad operations. Among 
the items that should be considered for addition to the model are the following: 

1. Freight-car traffic classes (as a function of type of car and commodity 
if loaded) and car movement priorities. 

2. Recognition of per diem costs. 

3. Recognition of priorities for handling segments (i.e., incoming trains) 
at yards. 

4. The ability to create extra trains analogous to the running of “extras” 
in the real world. 

5. Probabilistic travel times and yard operation times to reflect the non- 
deterministic nature of the railroad environment. 

6. More sophisticated rules to govern which cars are to be picked up at 
yards by specific trains. 

A greater ability to draw inferences about real railroads would result from a 
network model that contained detailed representations of individual yards and 
links. Such an all-encompassing model, however, requires strenuous computer 
programming efforts and is feasible only with computer speeds and memories 
superior to those commonly available today. 

5.2. Summary 

The model described in this article is believed to be the first of a railroad 
network in which trains and cuts “flow” through the network as time advances 
and in which yards and their operations are considered. Although SIMSCRIPT 
is no panacea for modeling a system as complex as a railroad, its world view 
permits construction of the model with much less effort than would be required 
by machine-level or traditional scientific programming languages. Possible 
applications for the model are the following: 

1. A tool for predicting what total railroad operating performance will 
be if specified operating policies are implemented against specified demand 
traffic patterns and to compare the associated costs of alternative sets of policies. 

2. A training device for railroad operating management. The model 
may be used to increase management’s understanding of the system-wide 
implications of individual and local operating decisions. 





The potential contribution of computers and simulation to the planning 
and analysis of total-system railroad operations remains to be established 
and is severely dependent on future hardware and software costs and charac- 
teristics. Experience in the development and application of models such as 
that described is needed to provide a basis on which an appraisal of the 
benefit of simulation to total-network railroad analysis may be made. 


6. ACKNOWLEDGMENTS 

The railroad network model is currently being supported in part by the North- 
east Corridor Transportation Project of the U.S. Department of Commerce. 
I wish to acknowledge various forms of support given to earlier versions of the 
model by The Transportation Center at Northwestern University, the Inter- 
national Business Machines Corporation, and the New York Central Railroad 
Company. Special thanks are due to Messrs. A. S. Dang, K. Marsch, R. Nadel, 
and E. Rovner of the New York Central for guidance in model definition. 


7. REFERENCES 

[1] M. Beckmann, C. B. McGuire, and C. B. Winsten, Studies in the Economics of Trans- 
portation, Yale University Press, New Haven, 1955. 

[2] A. W. Boldyreff, “Determination of the Maximal Steady State Flow of Traffic 
Through a Railroad Network,” Operations Res. (November 1955). 

[3] A. Charnes and M. H. Miller, “A Model for the Optimal Programming of Railway 
Freight Train Movements,” Management Science (October 1956). 

[4] R. Crane, F. Brown, and R. Blanchard, “An Analysis of a Railroad Classification 
Yard,” Operations Res. (November 1955). 

[5] G. Feeney, “The Empty Box Car Distribution Problem,” Proc. 1st Intern. Conf. 
Operational Res., English Universities Press, London, 1958. 

[6] General Purpose Systems Simulator III Users Manual, IBM Corporation, White 
Plains, New York, 1965. 

[7] C. Hudson, “A Computer Simulation of CTC Railroad Operations,” ASME 
Paper 62-RR-5, ASME-AIEE-EIC Railroad Conference, Toronto, April 1962. 

[8] H. Markowitz, B. Hausner, and H. Karr, “SIMSCRIPT: A Simulation Pro- 
gramming Language,” The RAND Corporation, Santa Monica, Calif., 1963. 

[9] Unpublished reports of the Railroad Systems Research Group of the Battelle 
Memorial Institute, Columbus, Ohio. 


A COMPUTER SIMULATION OF RAILROAD 
FREIGHT NETWORKS 


SUMMARY 

A railroad may be regarded as a network of nodes (yards) and links (tracks); 
in this system there are time-dependent demands for the movement of 
freight cars. Associated with car movement are the following questions: 

— When and to what destinations should regular trains be moved? 

— On what principle are cars classified (sequenced) at the yards? 

— Which cars are assigned to a train for each link? 

These interdependent questions must be answered simultaneously and in accord 
with the general policies of the railroad company. From time to 
time these policies must be modified according to traffic demands. 

The work presented here concerns a simulation model that permits ex- 
perimentation with alternative car-operating policies from a 
total-system viewpoint. The model is built with the SIMSCRIPT simulation 
language. The input contains the (time-dependent) car demands 
for fixed routes, the routes and schedules of trains, the policies for 
classifying and assembling trains at yards, and the allocation of cars to trains. 
Cars are classified at yards and moved by trains through the system. 
The output contains various measures of railroad operating performance 
(departure and arrival times, volumes, train lengths, de- 
lays due to cars, and yard operating expenses). 


NETWORK RETRIEVAL OF FREIGHT RATES 

Un Recouvrement des Tarifs Marchandises par un Reseau 

William D. Cassidy 

Military Traffic Management and Terminal Service 
United States of America 

1. INTRODUCTION 

Determining the price of freight shipments moving in the United States 
domestic freight system is a task of enormous complexity. Not counting rates 
for air movements, the structure of transportation rates fills some 12,000 volumes 
of tariffs published directly and indirectly by American carriers, only a small 
part of which are passenger rates. Moreover, the tariff structure is not static; 
rather there is a continuous flow of rate changes from the transportation 
industry. 

Thus a shipper is burdened with more than the task of selecting the most 
suitable mode and service for moving his shipment: he must search the rate 
structure to determine price choices if he is concerned with optimizing his 
shipping costs. Generally, he is not guaranteed optimum price quotations by 
carriers for a given combination of transportation, scheduling, and incidental 
services, for as a rule a number of prices are possible in the tariff structure 
for each combination. 







If the shipper habitually ships a limited number of commodities to a small 
set of destinations, his tariff experience eventually provides him with a reason- 
ably good basis for shipment and service pricing. As his list of commodities 
to be shipped increases, however, he soon reaches the point at which retention 
and updating of his experience becomes an economic problem. 

The purpose of this article is to examine the application of graph-theoretic 
techniques to automation of rate-structure searches in an attempt to provide 
shippers with something better than an experience data bank. 

2. SOME CHARACTERISTICS OF THE U.S. FREIGHT 
TARIFF STRUCTURE 

When considered as a problem in computer storage and retrieval, the tariff 
structure may be separated into two categories. One consists of arbitrary 
rates and rules, the other, of rates that may be expressed as computer-pro- 
grammable functions of computer-storable domains. 

Admittedly, the first category contains the larger amount of information, 
but it turns out that most shipments can move at near-optimum rates found in 
the second — for the shipper with a large number of commodities and 
destinations. 

In the information of the first category we find a large body of constant 
rates for specific commodities that move between few points. We also find 
many arbitrary rules, classifications of commodities and locations, individual- 
carrier exceptions to group tariffs, and other miscellaneous data. 

The rates of the second category are those that can be expressed as functions 
of the variables regarded by carriers as factors that determine transportation 
costs. The most common are weight, size, density, and distance. The list, 
however, is long: it extends to such surprising variables as temperature and 
day of year. 

The functions themselves range from simple to complex expressions. 
They may be continuous or step functions or they may be functions of one 
or several variables. 

Nearly always, when distance may be treated separately in evaluating the 
function, the distance rate base is the shortest distance between the origin 
and destination — the “short-line distance”; that is, if several carriers join in 
a distance-based tariff, the rate base between two points for all parties joining 
in the tariff will be the distance traveled by the carrier with the shortest route. 
This rate base is binding on all, even those carriers that are not permitted to 
travel the short-line routes and who actually transport their shipments over 
longer routes. 

An important example of distance-based rates is the group of class-rate 
tariffs for rail carriers filed under Docket 28300 with the Interstate Commerce 
Commission. For this set of rates (and many others) price is a function of 
general commodity characteristics, weight, and short-line distance. Commodity 
characteristics are attended to by categorizing shipments into groups of similar 
commodities. To each group is assigned a factor to be applied to the value of 
a simple function of distance and weight. 
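The shape of such a class rate can be sketched as a commodity-group factor applied to a base function of short-line distance and weight. The base function and numbers below are invented placeholders, not Docket 28300 figures.

```python
def class_rate(commodity_factor, weight, short_line_distance, base_fn):
    """Price = group factor x base(distance, weight), the functional
    form described for class-rate tariffs of this kind."""
    return commodity_factor * base_fn(short_line_distance, weight)

def base(distance, weight):
    """Illustrative base function only: a per-mile charge with a
    floor, scaled per hundredweight."""
    return max(25.0, 0.02 * distance) * (weight / 100.0)
```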







The most complex subalgorithm in evaluating the whole function that 
expresses Docket 28300 class rates is that of determining the short-line distance 
between origin and destination. For a large shipper this evaluation may need 
to be made thousands of times a day if he is to have an effective tariff-structure 
search for his shipments. The remainder of this article discusses computer 
algorithms to determine these distance rate bases. These algorithms are based 
on network models of carrier systems identified with the various distance- 
based tariffs, such as those of Docket 28300. 

3. GENERAL NOTION OF THE MODEL 

Docket 28300 class-rate tariffs consist of tables of rate bases (entered with 
origin and destination), tables of rates (entered with rate bases and commodity 
category factors), and a set of constraints on the use of the tariffs. The rate 
bases are a function of short-line distance, and there is a rate base in the tables 
for each pair of points in the rail net. 

To store and retrieve the rate bases the simplest computer technique would 
be to parallel the existing manual system by a table-lookup technique. Such a 
technique, however, would call for some 2,500,000 storage entries for Docket 
28300 tariffs alone. When we consider the multitude of distance-based tariffs 
other than Docket 28300, the economic infeasibility of a table-lookup technique 
becomes clear. 

So we choose our model from the other end of the series of possibilities; 
that is, a loopless, undirected, connected graph with no parallel arcs. The 
information we store is the record of adjacent vertices and the weights on the 
edges connecting them. Identifying adjacent vertices in this least-information 
model is achieved by ordering edge weights in any of a number of ways. This 
approach reduces the number of storage entries for Docket 28300 tariffs to 
about 35,000. 

The problem of retrieving rate bases now becomes one of finding sums of 
weights on the edges of minimum paths between origin and destination. 
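The least-information store can be sketched as an adjacency list: one weight per edge, recorded in both directions for an undirected graph. The edge data are illustrative.

```python
from collections import defaultdict

def build_graph(edges):
    """Adjacency-list store for a loopless, undirected graph with no
    parallel arcs.  Storage grows with the number of physical links,
    not with the number of origin-destination pairs."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w   # undirected: record both directions
    return adj
```

For a completely connected table of n points the entry count grows as n(n-1)/2, whereas the adjacency list needs only one entry per direction of each physical link, which is the source of the drop from roughly 2,500,000 entries to about 35,000.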

4. CONSTRUCTING NETWORK MODELS OF TARIFFS 

The table-lookup retrieval technique may be regarded as an extreme variation 
of the model just described. Here the graph is completely connected in the sense 
that there is an edge for every pair of nodes, but the information that must be 
stored is highly redundant. 

At the other extreme the least-information model has but one set of edges 
corresponding to a minimum path between any pair of vertices. This model 
can be made isomorphic to a somewhat idealized geometric representation of 
the rail map itself. 

To construct the model we may start with the information in the published 
tariff and eliminate redundancies by an iterative process until we have reduced 
the data base to a set of weights on edges between vertices that nearly corre- 
spond to links between rate-basing points of the carriers’ real world. An impor- 
tant constraint on the least-information model is that it must reproduce the 






V_i = {x_1, x_2, ..., x_m} = the set of vertices considered in the ith iter- 
ation of the algorithm. 

U_i = value of the upper bound at the beginning of the ith 
iteration. 

x_0 = origin vertex for a given retrieval task. 

x_d = destination vertex for a given retrieval task. 

D_i = {d(x_1), d(x_2), ..., d(x_k)} 
= the set of distances associated with V_i. 

Γ(x_j) = the set of vertices adjacent to x_j. 

R(x_0, x_d) = rate base between x_0 and x_d. 

In the unidirectional algorithm the computer program fans out from the 
origin to one set of adjacent vertices after another, rejecting all new-found 
distances to each vertex that are not less than those already found and all 
distances greater than the upper bound. As soon as the destination is reached, the 
upper bound begins to decrease monotonically as subsequent iterations eliminate 
all but minimal distances. Note that the origin-destination minimum 
distance is found, but all other minimum distances from the origin are not. 

The steps of the unidirectional algorithm are as follows: 


1. If x_0 = x_d, then R(x_0, x_d) = K, where K is a constant rate base for intracity traffic for the freight carrier. The algorithm stops. If x_0 ≠ x_d, the algorithm continues. 

2. U_1 = diameter of G. 

3. V_1 = {x_0}. 

4. D_1 = {d(x_0)} = {0}. 

5. i = 1. 

6. Increase i by unity. 

7. U_i = U_{i-1}. 

8. V_i = [∪ Γ(x_j), over x_j ∈ V_{i-1}] ∪ V_{i-1}. 

9. D_i = [∪ {d(x_j) + w(x_j, x_t) = d(x_t)}, over x_j ∈ V_{i-1} and x_t ∈ Γ(x_j)] ∪ D_{i-1}. 

10. V'_i = V_i − {∪ x_k'}. 

11. D'_i = ∪ {d(x_k) = min[d(x_k), d(x_k')]} for every x_k' ∈ V_i that is identical with x_k ∈ V_i, k ≠ k'. 

12. D''_i = D'_i − {d(x_d)}, U_i unchanged, V''_i = V'_i − {x_d}, if x_d ∈ V'_i and d(x_d) > U_i. D''_i = D'_i, U_i = d(x_d), V''_i = V'_i, if x_d ∈ V'_i and d(x_d) ≤ U_i. D''_i = D'_i, V''_i = V'_i, if x_d ∉ V'_i. 

13. V''_i = V''_i − {∪ x_t} for every x_t ∈ V''_i such that d(x_t) > U_i. 

14. R(x_0, x_d) = U_i if there is no change from D''_{i-1} to D''_i. Return to Step 6 otherwise. 
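Read as pseudocode, the steps describe a bounded fan-out search. A minimal Python sketch of that reading (function and variable names are mine, assuming the adjacency-list storage described earlier):

```python
def rate_base(adj, origin, dest, diameter, K=0):
    """Unidirectional retrieval sketch: fan out from the origin (Steps 6-9),
    keep only improving distances within the upper bound (Steps 10-13),
    and tighten the bound whenever the destination is reached (Step 12)."""
    if origin == dest:
        return K                       # Step 1: constant intracity rate base
    upper = diameter                   # Step 2: initial upper bound
    dist = {origin: 0}                 # Steps 3-4: V1 = {x0}, D1 = {0}
    frontier = {origin}
    while frontier:                    # Step 14: stop when distances settle
        new_frontier = set()
        for u in frontier:
            for v, w in adj[u]:        # fan out to adjacent vertices
                d = dist[u] + w
                if d > upper or d >= dist.get(v, float("inf")):
                    continue           # reject non-improving or out-of-bound paths
                dist[v] = d
                if v == dest:
                    upper = d          # the bound now decreases monotonically
                else:
                    new_frontier.add(v)
        frontier = new_frontier
    return upper                       # R(x0, xd); equals diameter if unreachable
```

On a small triangle graph with edge weights 1, 1, and 4, the sketch returns the two-edge path of length 2 rather than the direct edge of weight 4.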


7. BIDIRECTIONAL ALGORITHM 

Let the symbols remain as before, except as follows: 

V_i = {x_{i,1}, x_{i,2}, ..., x_{i,s}} = the set of vertices considered in the [(i + 1)/2]th 
iteration of the algorithm in steps from the 
origin vertex. 



NETWORK RETRIEVAL OF FREIGHT RATES 




Z_i = {x_{i,1}, x_{i,2}, ..., x_{i,t}} = the set of vertices considered in the (i/2)th 
iteration of the algorithm in steps from the 
destination vertex. 

E_i = {e(x_{i,1}), e(x_{i,2}), ..., e(x_{i,n})} = the set of distances associated with Z_i. 

Timing tests of the unidirectional algorithm on a computer showed that 
when the upper bound began to decrease, the unidirectional algorithm converged 
rapidly to the solution. In an effort to reduce computation time up to the 
commencement of a decreasing upper bound, a bidirectional algorithm was devised. 

In the bidirectional algorithm the computer moves out alternately from the 
origin and destination to one set of adjacent vertices after another, rejecting all 
new-found distances to each vertex that are not less than those already found 
and testing continually for vertices reached from both origin and destination. 
As soon as some vertex has been reached from both terminals, the sum of the 
distances to the vertex (from each terminal) becomes the upper bound. The 
algorithm continues in the same manner as the unidirectional algorithm to 
complete the convergence to the solution. 

If the density of vertices Δ were uniform throughout the graph, the 
number of vertices N_2 that would be considered by the bidirectional algorithm 
up to the commencement of convergence is approximated by 

N_2 ≈ 2πd²Δ, 

where d is the distance from the origin to the first vertex reached commonly 
from origin and destination. On the other hand, the number of nodes N_1 
considered by the unidirectional algorithm up to the same point is approximated 
by 

N_1 ≈ π(2d)²Δ. 

Thus the expected relative time of computation in the preconvergence part of 
the algorithm is approximated by 

N_2 / N_1 ≈ 1/2. 

The introduction of the unidirectional technique at the commencement of 
convergence allows us to use the upper-bound device and nearly always 
prevents the redundant solution of the entire minimum spanning tree rooted at the 
origin. 

The steps of the bidirectional algorithm are as follows: 

1. If x_0 = x_d, then R(x_0, x_d) = K, where K is a constant rate base for intracity traffic for the freight carrier. The algorithm stops. If x_0 ≠ x_d, the algorithm continues. 

2. V_1 = {x_0}; Z_2 = {x_d}. 

3. D_1 = {d(x_0)} = {0}; E_2 = {e(x_d)} = {0}. 

4. i = 1; j = 2. 

5. Increase i and j by unity. 

6. V_i = [∪ Γ(x_m), over x_m ∈ V_{i-1}] ∪ V_{i-1}. 

7. Z_j = [∪ Γ(x_m), over x_m ∈ Z_{j-1}] ∪ Z_{j-1}. 

8. D_i = [∪ {d(x_j) + w(x_j, x_t) = d(x_t)}, over x_j ∈ V_{i-1} and x_t ∈ Γ(x_j)] ∪ D_{i-1}. 

9. E_j = [∪ {e(x_m) + w(x_m, x_n) = e(x_n)}, over x_m ∈ Z_{j-1} and x_n ∈ Γ(x_m)] ∪ E_{j-1}. 

10. V'_i = V_i − {∪ x_k'}. 

11. D'_i = ∪ {d(x_k) = min[d(x_k), d(x_k')]} for every x_k' ∈ V_i that is identical 
to x_k ∈ V_i, k ≠ k'. 

12. Z'_j = Z_j − {∪ x_n'}. 

13. E'_j = ∪ {e(x_n) = min[e(x_n), e(x_n')]} for every x_n' ∈ Z_j that is identical 
to x_n ∈ Z_j, n ≠ n'. 

14. If x_k ≠ x_n for every x_k ∈ V'_i and every x_n ∈ Z'_j, return to Step 5. If 
x_k = x_n for any x_k ∈ V'_i and any x_n ∈ Z'_j, set U_i = min[d(x_k) + e(x_n)], V'_i = V'_i ∪ Z'_j, 
and continue with Step 6 of the unidirectional algorithm. 
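A matching sketch of the bidirectional phase (again with my own names; once the two searches meet, the sum of the two distances to the meeting vertex seeds the upper bound, standing in for the hand-off to the unidirectional phase in Step 14):

```python
def rate_base_bidirectional(adj, origin, dest, diameter, K=0):
    """Bidirectional retrieval sketch: alternate one-layer fan-outs from the
    origin and the destination; a vertex reached from both sides yields an
    upper bound that prunes the rest of the search (Step 14)."""
    if origin == dest:
        return K                                 # Step 1: intracity rate base
    upper = diameter
    d = {origin: 0}                              # forward distances (the D sets)
    e = {dest: 0}                                # backward distances (the E sets)
    frontiers = [({origin}, d), ({dest}, e)]
    while any(f for f, _ in frontiers):
        for k, (frontier, dist) in enumerate(frontiers):
            new = set()
            for u in frontier:                   # Steps 6-9: one layer of fan-out
                for v, w in adj[u]:
                    dv = dist[u] + w
                    if dv > upper or dv >= dist.get(v, float("inf")):
                        continue                 # Steps 10-13: keep minima only
                    dist[v] = dv
                    new.add(v)
                    other = e if dist is d else d
                    if v in other:               # met from both sides (Step 14)
                        upper = min(upper, d[v] + e[v])
            frontiers[k] = (new, dist)
    return upper                                 # R(x0, xd)
```

In this simplified form both searches simply run to exhaustion under the bound, so the minimum over meeting vertices is still attained; the paper's version instead finishes with the unidirectional steps.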


8. FRONTIER VERTICES AND ABBREVIATED NETWORKS 

Docket 28300 tariffs divide the continental United States (excepting the Far 
West) into five territories. A set of rate bases for each territory, as well as a 
number of sets for vertex pairs between territories, is available. Rate-base 
retrieval time can be considerably reduced for vertices in different territories 
(particularly those that join at no vertex) if the routes through territories other 
than those of origin and destination are searched by means of abbreviated net- 
works. Such a network for a given territory consists only of the boundary vertices 
the territory has in common with other territories, along with the set of weighted 
edges connecting these frontier vertices. Whether the computer will search a 
particular territory's network in abbreviated form or in complete form is 
determined by an initial decision subroutine. The economy of this technique 
depends on computer storage cost and the profit in increased retrieval speed. 

Another form of network abbreviation results from omitting all or some of 
the vertices of degree two. Here again the value of increased retrieval speeds 
must be weighed against increased storage costs. The identities and weights of 
two adjacent retained vertices for each omitted vertex must be stored and 
retrieved by a table-lookup routine when the omitted vertex is wanted for an 
origin or destination. 
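Under the same assumed adjacency-list storage, the degree-two abbreviation can be sketched as follows (the function and its names are mine; the returned table plays the role of the table-lookup routine for omitted vertices):

```python
def contract_degree_two(adj):
    """Sketch: remove each degree-two vertex by replacing its two incident
    edges with one edge whose weight is their sum; record the omitted vertex
    and its two retained neighbours in a lookup table."""
    lookup = {}
    for v in [u for u, nbrs in adj.items() if len(nbrs) == 2]:
        if v not in adj or len(adj[v]) != 2:
            continue                        # degree changed by an earlier contraction
        (a, wa), (b, wb) = adj[v]
        if a == b:
            continue                        # would create a parallel edge; keep vertex
        lookup[v] = ((a, wa), (b, wb))      # restore table for the omitted vertex
        adj[a] = [(x, w) for x, w in adj[a] if x != v] + [(b, wa + wb)]
        adj[b] = [(x, w) for x, w in adj[b] if x != v] + [(a, wa + wb)]
        del adj[v]
    return lookup
```

Contracting a four-vertex chain this way leaves a single edge carrying the total weight, with the two interior vertices recoverable from the table when one of them is wanted as an origin or destination.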

In some cases it becomes economical to replace subnets that contain vertices 
of degree greater than two with a single weighted edge. These subnets may be 
identified by a probabilistic analysis of traffic patterns (as reflected in system 
inquiries). It is quite likely that such an analytic routine could be included in 
the computer program, and the computer could effectively increase its efficiency 
by moving data between the network-storage data base and the table-lookup 
data base. 






UN RECOUVREMENT DES TARIFS MARCHANDISES 
PAR UN RESEAU 

RESUME 

La structure du tarif des marchandises des États-Unis est un ensemble de 
renseignements très grand et compliqué. Un problème pour permettre l'accès 
du computeur à cette structure est le développement de la technique de rapporter 
la distance la plus courte pour les transporteurs entre deux points géographiques. 
Cette méthode doit être précise au point de doubler les inconsistances apparentes 
entre ces distances. 

Les techniques théoriques-graphiques sont appliquées au problème et deux 
« algorithmes » de computeur sont développés qui produisent une limite 
supérieure décroissante pour augmenter l'efficacité du computeur. 

Le rapport en plus discute d'autres méthodes, un peu particulières au 
problème spécifique, pour augmenter l'efficacité de l'algorithme. 


OPERATIONS RESEARCH ACTIVITIES ON THE JAPANESE 
NATIONAL RAILWAYS 

Activités de Recherche Opérationnelle aux 
Chemins de Fer Nationaux Japonais 

Shun-ichi Urabe 

Japanese National Railways, Railway Technical Research Institute, Japan 


1. OPERATIONS RESEARCH ACTIVITIES 

Even before 1945 Japanese National Railways (JNR) was engaged in studies 
that today we call operations research. The results of these studies were made 
available in a report on the use of simulation to manage freight-car movements 
in a marshaling yard and another on the analysis of line capacity. 

As soon as JNR recovered from the devastation of World War II, it introduced 
technical renovation in an effort to upgrade its activities. In addition 





to, or correlated with, this reconstruction were new operating procedures 
based on one of the most powerful means of achieving new growth: operations 
research, with its unique ideas and methods. In or around 1952 study groups 
in operations research were organized in several sectors of JNR and its tech- 
niques were extended to many units of the network. 

In anticipation of the wide applicability of operations research, the Railway 
Technical Research Institute in 1956 established the Operations Research 
Laboratory, uniting on its staff mathematicians, physicists, and other technical 
experts. At the same time, those in JNR’s head office who were interested in 
operations research formed a group that was authorized in 1960 by the Office 
of Long-Term Planning to function as an Operations Research Center, an 
official staff arm of JNR. In the meantime marketing research had been 
introduced into the traffic department, and random sampling and inventory control 
techniques were studied by the department responsible for purchasing, storing, 
and distributing supplies. Many new methods were adopted: the control-chart 
and experimental design methods were employed in rolling-stock repair 
shops, queuing theory was applied to the analysis of marshaling-yards operations, 
and dynamic programming was used to coordinate operations at hydroelectric 
and thermal power plants. In other words, operations research steadily gained 
in popularity in various sectors of JNR. It should not be overlooked, however, 
that faculty members of Japan’s leading universities acted as advisors to JNR 
and thus assisted in the implementation of its considerable achievements. 

Other studies connected with operations research were carried out in local 
units of JNR, as OR and IE found wider application in the operation of 
rolling-stock repair shops scattered all over the system. An annual symposium on the 
application of operations research to railway operations was inaugurated in 
1958 under the auspices of the Office of Long-Term Planning. This symposium 
has since held eight sessions, at each of which numerous papers were submitted 
and from which 30 or more were selected and published each year. 

The Operations Research Center provides instruction and guidance in the 
solution of various technical problems by operations research methods and 
manages those operations research projects that encompass the entire JNR 
system. 

The Railway Technical Research Institute, with the Operations Research 
Laboratory as a nucleus, employs operations research methods whenever prac- 
ticable in the technical research conducted in its laboratories. The following 
is an outline of three of the latest reports on the research done by my colleagues 
and me at the Train Operation Research Laboratory. 

2. THREE EXAMPLES 

2.1. Simulation of a Transport System of Commuter Trains 

In Japan urban areas have spread so rapidly that the commuting public 
has markedly increased. Safe transport of these people has become the 
responsibility of JNR. The purpose of this study was to determine the effects 
of various factors on transport capacity and to improve scheduling methods. 



The Chuo Line, one of the most important rail sectors in the Tokyo area, was 
selected as the subject. 

First the phenomenon of commuter traffic was analyzed statistically, and 
mathematical functions concerning the number of passengers to be transported, 
the number boarding and alighting at stations, the time required for these 
movements, and the actual operation of timetables were developed. These 
functions were used in the construction of a model of commuter service, and a 
computer program to simulate the operation of the model was written by 
Professor M. In. 

Many simulations were carried out on the CDC-20 computer, and studies 
were made of the number of passengers left on platforms, the delays in 
arrivals and departures, and a method of improving schedules to accommodate 
population growth. This work was done under the supervision 
of Professor S. Moriguchi. 

2.2. Determination of Safety Factors in the Prevention of Train 
Accidents 

The provision of facilities for accident prevention to ensure the safety of 
trains is one of the most important problems in planning train operations. 
This research was aimed at estimating the number of accidents per station as 
a function of all factors with a bearing on the incidence of accidents. Thus 
the number of accidents per station could be expected to change as these factors 
varied, and the safety factor could be determined mathematically. 

Data gathered included the number and kinds of trains, their speeds, the 
ratio of passenger to freight cars, and the ratio of single to double track. Acci- 
dents were classified into those of trains on line haul (including trains on sidings) 
and trains in marshaling yards. A functional equation was written for each type 
of interlocking system. 

The number of accidents per station was expressed in a multiple regression 
formula: 

Y = a_0 + a_1 x_1 + a_2 x_2 + ... + a_n x_n, 

where x_1, x_2, ..., x_n are individual variables. 
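Such a fit can be sketched with ordinary least squares via the normal equations (the function name and the sample data below are mine, not the study's):

```python
def fit_multiple_regression(rows, y):
    """Least-squares fit of Y = a0 + a1*x1 + ... + an*xn via the normal
    equations, solved by Gaussian elimination with partial pivoting."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    n = len(X[0])
    # Build A = X^T X and b = X^T y.
    A = [[sum(X[k][i] * X[k][j] for k in range(len(X))) for j in range(n)]
         for i in range(n)]
    b = [sum(X[k][i] * y[k] for k in range(len(X))) for i in range(n)]
    # Forward elimination.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, n))) / A[i][i]
    return coef
```

With data generated exactly by Y = 1 + 2x_1 + 3x_2, the fit recovers those coefficients; the study's own variables were the station factors listed above.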

2.3. Minimization of Total Expense in the Purchase, Distribution, and 
Mixing of Coal for Steam Locomotives 

When coal-producing areas and locomotive depots are widely scattered, 
the linear programming method will generally indicate how to buy and distribute 
coal of different grades to locomotive depots, how to mix them to suit different 
needs, and how to minimize the total expense. 

If an attempt is made, however, to express mathematically all conditions 
relevant to the use of coal and thus to include them in the system, many non- 
linear factors will make their appearance. In this study nonlinear factors 
were given linear representation in order to arrive at an integrated linear 
programming system, and the system was solved on a CDC-20 computer. 
The purpose of the study was to computerize the purchase, distribution, and 
mixing of coal. 





ACTIVITES DE RECHERCHE OPERATIONNELLE AUX 
CHEMINS DE FER NATIONAUX JAPONAIS 

RESUME 

D'abord l'histoire sur les activités de recherche opérationnelle aux Chemins 
de Fer Nationaux du Japon est expliquée brièvement. 

Ensuite, à titre d'exemples, trois études faites par l'auteur avec ses collègues 
récemment sont présentées : 

1. Achèvement du programme de simulation reproduisant les phénomènes 
des grands trafics quotidiens des voyageurs banlieusards. 

2. Établissant des normes des équipements de sécurité, le nombre des 
accidents par gare est exposé par une formule de régression multiple linéaire 
composée de facteurs considérés comme des causes des accidents. 

3. Systématisation complète des achats, distribution et mélange des charbons 
de terre pour locomotives à vapeur par le modèle de programmation 
linéaire. 


SAFETY-AT-SEA PROBLEMS 

La Sécurité dans des Problèmes Maritimes 

Arne Jensen† 

Institute of Mathematical Statistics and Operations Research 
The Technical University of Denmark 


One of the problems of practicability is to determine how relevant measurements 
are to be obtained. People talk about “ imponderables ” as if they were small 
demons lurking in their offices. And indeed, if one says: I must know what is 
going to happen on the 13th of June next year, then there is no means of knowing 
and that is that. But there is no point in asking absurd questions which, in the 
nature of things, cannot be answered and then setting up a wail about im- 
ponderables. A better approach, better because it works, is to invite the scientist 
to examine the problem situation from every possible slant, reshuffling it, as it 

† With permission of Stafford Beer I have used part of his English version of my 
early presentation of my work given in a seminar in 1962. People who know his and my 
styles will easily recognize Beer's contribution. 


SAFETY-AT-SEA PROBLEMS 


363 


were, in search of a commensurable approach. The scientist will look for 
some critical feature of the situation that is, on the face of it, measurable, and 
he will try to find a metric that fits it. I will give you an example. 

There is a narrow channel between Denmark and Sweden that provides 
the only entrance to the Baltic. The narrowest crossing is between Helsingor 
(which lovers of Shakespeare usually call Elsinore) in Denmark and Halsingborg 
in Sweden. We Scandinavians transport ourselves and our goods freely across 
the sound. The journey by ferryboat takes less than half an hour and little fuss 
is made about such formalities as passports. Now the cross-channel traffic 
is steadily increasing and so is the sea-going traffic between the Baltic and the 
outside world. Thus there is an increasing congestion in the channel and 
obviously an increasing risk of collision. 

The governments of Denmark and Sweden set up a joint commission to 
investigate this situation. In particular, there could well be a bridge between 
the two countries. Consider, said the commission, the state of affairs when the 
traffic in both directions has doubled. Needless to say, it is not very difficult 
to make a statistical forecast of the date when this will occur, nor is it difficult 
to support the statistical extrapolation by arguments from the plans and 
intentions of all concerned. Clearly, one of the factors, the risk of collision, will 
be increased, compared with the risk today. The layman might very well 
imagine that when the traffic is doubled the risk of collision will be doubled; 
on the other hand, he might very well wonder whether this would be true. In 
any case, it is likely that the layman will suppose that no conceivable way of 
computing this risk could be obtained short of running an experiment, whether 
in the water or by computer simulation, in which a doubled traffic was allowed 
to run. An experiment in the water is clearly impracticable, for it would take 
an enormous amount of organization and money, even if it could be done. A 
simulation may well be impracticable, too, for the very good reason that 
collisions depend in the last resort on the failure of human beings standing on 
ships' bridges to avert them. It is difficult indeed to say what role will be played 
by the human element in a situation that no one has yet experienced. 

In these circumstances the commission very properly sought operations 
research advice. I was asked to measure the increased risk for a doubled traffic 
flow. Many people said it could not be done; the research is impracticable; 
the risk is imponderable. 

I had, in fact, no more idea than anyone else, at the start, how to set about the 
task of making this measurement. I talked with the captains of both ferryboats 
and ocean-going ships about the problem, and everybody agreed that the risk 
of collision would be higher and wagged their heads. This did not take the 
problem much further. Now I knew that, given a knowledge of the stochastic 
processes governing the movement of ships, I could calculate the likely number 
of incidents in which two ships would enter the same arbitrarily sized body of 
water at the same time, but no one could tell me how near to each other the 
ships had to be before they could be said to have embarked irrevocably on a 
collision course. The practical men, very naturally, said that if the ships missed 
each other — even by a hair's breadth — then there was no collision, whereas if 



364 


ARNE JENSEN 


they did not, there was. It is quite clear that if one computed the probability 
that two ships would arrive simultaneously in an area of water of the size of one 
ship, one would not be estimating the chance of a collision. Such a computation 
would be based on a model in which ships would suddenly appear at a point 
whereas in reality we would obviously be dealing with an interacting system of 
some complexity. 

So, still entirely puzzled by the key problem of establishing a metric, I 
decided that as an empirical operations-research scientist I should at least try 
to obtain some facts. People were free with opinions, beliefs, and prognostications; 
but the first step should be to make some kind of measure of some kind 
of event that had actually occurred. Accordingly, we made a film of the traffic 
moving about on the sound. What was photographed was a radar screen placed 
on the shore just outside the castle of Kronborg, on which the movements of 
ships appear as they would familiarly appear to ships’ masters. The camera 
recorded the state of the screen at discrete time intervals and not continuously. 
The result was a correct record of movements, except that everything appeared 
to be happening at about two hundred and fifty times the proper speed. 

Having obtained some basic facts, I did not know what to do next. I did, 
of course, study the film, and I observed certain areas in which congestion was 
characteristically high, but neither I nor my colleagues could yet find a means 
of defining a collision risk. My next move therefore was to mount my film in 
a theater and to show it to a group of very experienced men that included six 
who had been ships’ masters themselves and now held posts in administration 
on behalf of responsible authorities. One had become an accident inspector, 
another was the chief of the harbor authority, another was running the ferryboat 
service, and there was a representative of the Navy. These experts were asked 
to collaborate by watching the film and trying to detect dangerous situations. 
How near would two boats have to approach each other before a genuine col- 
lision risk was involved ? If the captains, having watched this film, could apply 
their own experience to it, they might help to suggest the approach to a risk 
metric which was wanted. Unfortunately they could not. 

The cybernetic computer in the cranium, however, often tells its owner 
things he cannot analyze and report about to professors of operations research. 
I noticed that there were moments of tension among my audience. The experts 
would catch their breath in unison; there would be a straining forward in the 
seat; there would be smothered exclamations. When I noticed that these 
incidents were occurring, I stopped the film. It is emphasized that this was not 
part of the plan; it was opportunism, but there were some more facts, if only 
some use could be made of them. 

In all, some 40 incidents of this kind were noted down, with the times at 
which they occurred from the start of the film. After the experts had gone 
home we made a careful analysis of the frames on the film corresponding to 
the 40 incidents and presented them to the experts, who struck out about 14 
due to the velocity of the film. (See Figure 1.) 

I was looking for a pair of ships that had become dangerously close in 
the hope that the threshold of danger could be measured. I was looking for 






Figure 1. Expected collision places for 26 risky situations. 

conglomerations of many ships that might have looked threatening to experienced 
sailors. I found neither of these things. What Captain F. Heimdal and I did 
find in nearly twenty out of the twenty-six cases was a clear group of three ships, 
not appallingly close to each other, but still three. I recalled the experts and 
demonstrated this to them. (See Figures 2 and 3.) 

The situation suddenly became very clear. The codes of seamanship by 
which captains navigate their craft are based on a binary logic. If the captain 
of ship A sights ship B, he has a set of rules that enables him to decide how to 
steer in relation to ship B. Since ship B has the same set of rules, it takes 
complementary action that is consistent and the ships pass each other safely. 







B and C are shaping direct courses toward each other and both must 
consequently bear away to starboard. This turn may bring C in conflict with A, 
which otherwise would have passed clear of C. (See Figure 4.) 

A must bear out of the way of B; A may bear away to starboard or reduce 
speed; B must keep its course and speed. C is on a collision course with A; 
A must now give way to B but will have to keep course and speed because of C. 
(See Figure 5.) 

Now the situation that develops when a third ship appears on the scene is 
evidently dangerous. The master of ship A, who is conducting a simulated 
dialogue with the master of ship B, suddenly has to enter into a similar 
discourse with the master of ship C. Moreover, the action that the navigational 
code requires ship A to take in relation to ship B may not be consistent with the 
action it is supposed to take in relation to ship C. While the master of ship A 
is worrying about this, the masters of ships B and C are also faced with their 
versions of the dilemma, and the simulations may well become intolerably 
difficult. 

In fact, the difficulty of treating a triadic relationship in terms of a binary 
logic is notorious among logicians, never mind shipmasters. In practical terms 
the human brain boggles at the difficulty of analyzing a dynamic triadic situation, 
even when there is a capability to pass direct information. This is the reason 
why the search for incidents involving a considerable number of ships went 
unfulfilled. The mind trained in seamanship, and working in binary logic, simply 
cannot encompass the further area of sea and the larger number of ships. 





Here, then, we see the OR scientist grappling with a problem of measurement 
that appears to be impracticable, without knowing in advance what he 
really intends to do and carrying in his mind ideas he thinks must be right but 
which will actively mislead him unless he keeps his wits about him. Here, also, 
we find the experienced practical man precluded by that very experience and 
that very practicality from understanding precisely the difficulty in a situation 
he is trained by long experience to see in practice. The captains agreed with the 
analysis, when they saw it, but were incapable of making it themselves. Nor, 
incidentally, is there any obvious solution to the problem now defined. One 
cannot readily legislate for three-way radio conversations between ferryboats 
and ocean-going ships of different nationalities which are within distance of 
each other for so very short a time. Nor is an OR man likely to propose a solution 
to a local problem of this kind by demanding that international shipping 
codes be radically changed. This was not my problem. My problem was to 
calculate the increased risk in a doubled traffic flow. 

The answer is now terribly simple. According to empirical analysis of the 
data, the distribution of ships on the water is Poissonian; that is to say, it has 
a structure characteristic of chance interactions. If the traffic flow doubles, 
and the triadic relationship of vessels is the dangerous one, the rise in the risk 
of collision is 2³ = 8. So the risk of collision does not double but increases 
eightfold in the circumstances proposed. 


The Distribution of the Number of Ships in a Ten-Square-Mile 
Observation Field, Hellebæk 

    Observed Number    Observed Number    Calculated Number 
    of Ships (i)       of Times           of Times 
         0                  69                 59 
         1                  98                103 
         2                  83                 89 
         3                  53                 51 
         4                  18                 22 
         5                  11                  8 
         6                   1                  2 
         7                   2                  1 
         8                   0                  0 
    Total                  335                335 
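The scaling argument can be checked numerically from the table (a sketch only: the Poisson mean is estimated from the observed counts, and the cubic scaling of three-ship encounters is my rendering of the triadic argument):

```python
from math import exp, factorial

# Observed counts of i ships in the field, i = 0..8, from the table above.
observed = [69, 98, 83, 53, 18, 11, 1, 2, 0]
n = sum(observed)                                       # 335 observations
lam = sum(i * c for i, c in enumerate(observed)) / n    # estimated Poisson mean

# Expected counts under a Poisson(lam) distribution, for comparison with
# the calculated column of the table.
expected = [n * exp(-lam) * lam**i / factorial(i) for i in range(len(observed))]

# If three-ship encounters drive the risk, their rate scales with the cube
# of the traffic intensity: doubling the flow multiplies it by 2**3.
risk_ratio = 2 ** 3
```

The estimated mean is about 1.7 ships per field, and the eightfold risk ratio is exactly the 2³ figure quoted in the text.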


This report was put before the Bilateral Commission and caused a furore. 
I was denounced by people — practical men — who believed that the rules at sea 
were foolproof. Provided they were obeyed, there could be no accident. Here 
is the thought-block of the man who believes that a high-variety system can be 
controlled by low-variety regulations — provided they are obeyed. It is an 
incompetent notion unless a variety amplifier is present, and it is not in this case. 
Some of those concerned had seen the force of the argument and the sense of 







the measurement. They carried the day, perhaps because in the middle of the 
debate a collision actually occurred outside in the sound. Fortunately it was 
not a serious accident, but it is a sobering thought that men often wait for the 
inevitable tragedy before deciding that it may possibly occur. 

For those who are wondering about the potential bridge, it should be added 
that there were collision risks with the bridge itself to take into account and 
also other considerations about the advisability of setting up the international 
link elsewhere. The crossing from Copenhagen to Malmö, though much longer, 
has economic attractions. This gives rise to the thought that if there were two 
bridges a kind of inland sea would have been created between them. So there 
are many other problems involved, and thus scope for more operations research, 
but the point of the story remains. It is impracticable for the practical man to 
decide for himself what is practicable science. 

LA SECURITE DANS DES PROBLEMES MARITIMES 

RESUME 

Ce rapport étudie les futures conditions de trafic sur l'étroit bras de mer entre 
le Danemark et la Suède, à Helsingør-Hälsingborg. Le problème est de calculer 
l'augmentation du risque de collision si la circulation est à double courant. 
Comment va-t-on obtenir des faits justificatifs ? Il fut décidé d'enregistrer les 
mouvements des bateaux sur un film au moyen d'un écran de radar. (Ce film 
sera présenté au cours de la conférence.) Ce film fut présenté à un groupe 
d'hommes très expérimentés dans la détection des situations dangereuses. On 
remarqua qu'une relation triadique de navires est dangereuse, puisque les codes 
de manœuvre sont basés sur une logique binaire. Le risque de collision augmente 
alors huit fois lorsque le trafic est doublé, alors que les navires dans un 
champ d'observation suivent, d'après les données, une distribution de Poisson. 


A FORMAL DESCRIPTION OF 
STEAMSHIP CARGO OPERATIONS 

Une Description Formelle des Opérations d'un 
Paquebot Cargo 

R. J. Parente and D. F. Boyd 

International Business Machines Corporation 
Yorktown Heights, New York 
United States of America 

1. INTRODUCTION 

Systems studies are frequently based on mathematical or physical representations, 
or models. The experiments conducted allow a systems analyst to observe 




performance under specified conditions, and an ordinary study procedure 
usually takes this form: 

An experiment is designed. 

A mathematical or physical model is constructed. 

The conditions of the experiment are specified. 

The model is observed under test conditions. 

To conduct a study of this kind with the aid of a digital computer-based 
simulation system the study procedure is revised as follows: 

1. An experiment is designed. 

2. The system under investigation is represented in a formal manner 
acceptable to the simulation system. 

3. The experiment is specified in a formal manner acceptable to the 
simulation system. Included here are specifications for the conditions of the 
experiment and for the observations of the model. 

4. The experiment is performed by running the simulation system with 
the information specified in (2) and (3) as input. The output is a report of the 
observations specified in (3). 

This article is concerned only with (2). A method of representation is 
discussed and illustrated by a model of steamship cargo operations. 

Models of this kind have a potentially significant role in business and 
industrial applications as a tool for extending staff support to management. 
A wide range of useful experiments relating to the evaluation of alternative 
methods of conducting operations under various assumptions regarding the 
business environment may be carried out. 

2. SYSTEM REPRESENTATION 

There are two aspects to the representation of a system: the first is concerned with the identification and specification of each component that will exist in the system; the second pertains to the specification of the system dynamics.

Any system component that is to be manipulated or observed may be represented as an entity. Some simple examples in a port are a ship, a berth, and a cargo. Examples of more complex entities are a set of ships and a set of cargoes. In these instances the set can be considered as an entity that is to be manipulated or observed. Each entity is described, that is, its structure is specified, by associating with it certain characteristics or "attributes" of interest to a particular system representation. The values of the attributes will reflect the changes of state that result from the action of the system processes.

Individual entities of each of the specified types are entered into, or "created" in, the system. They exist in the system for a period of time and then leave it, or they are "destroyed." Assuming that the length of an experimental run is from simulated time T0 to Tf, the period of time that a given entity exists may begin at T0 or later and end at Tf or earlier. This implies some mechanism for initializing the state of the system at T0 as well as a mechanism for expanding or contracting the system while it is in operation.
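By way of a modern illustration (not part of the paper's language, and with purely hypothetical names), the entity-attribute view and the create/destroy lifetime within the run window [T0, Tf] might be sketched as follows in Python:

```python
# Hypothetical sketch of the entity-attribute idea described above.
# An entity is anything manipulated or observed; its state is the set of
# values of its attributes; it is "created" and later "destroyed" within
# the experimental run window [T0, Tf].

class Entity:
    def __init__(self, **attributes):
        self.__dict__.update(attributes)   # attribute values define the state

class System:
    def __init__(self, t0, tf):
        self.t0, self.tf = t0, tf          # start and end of the run
        self.entities = set()              # the current system population

    def create(self, **attributes):        # expands the system
        e = Entity(**attributes)
        self.entities.add(e)
        return e

    def destroy(self, e):                  # contracts the system
        self.entities.discard(e)

sim = System(t0=0.0, tf=100.0)
cargo = sim.create(stowage_category=1, tonnage=250, destination_port="PD")
cargo.tonnage += 50                        # a change of state of the entity
sim.destroy(cargo)                         # entity leaves before Tf
```

The sketch deliberately says nothing about time; the mechanism that makes state changes occur at particular simulated times is the subject of the dynamic description discussed below.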



R. J. PARENTE AND D. F. BOYD


The description of the state of a system at T0, or at any Ti, is not complete
without a description of the state of the processes taking place. A process is 
an entity that is not fully specified by its associated attributes. It also possesses 
dynamic characteristics; that is, a behavior pattern is associated with it. Like 
other entities, a process enters the system (when it is started), it exists for a 
period of time (while it is taking place), and then ceases to exist (when it stops).

The identification and specification of each entity to be represented are 
referred to as the “static description” of a system. The static description of a 
process does not include a specification of its actions; instead, the actions of 
the system processes are specified as a behavior pattern for each process in 
the system. The set of all the behavior patterns specifies the system dynamics, hereafter referred to as the "dynamic" description of a system.

The behavior of a process is specified as a series of actions that occur as 
events over time. Event-to-event relationships within a process may be time- 
dependent or they may be dependent on the state of the system. Since several 
processes may be taking place simultaneously, interaction may also occur. 
Complex structural relationships, involving several types of entities, may be 
formed. 

The actions of a process represent change in a system in four ways: 

1. They expand or contract the system by creating and destroying entities. 

2. They rearrange the system by moving entities in and out of sets. 

3. They modify the state of an entity. 

4. They affect the interactions in the system. 
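The four kinds of action can be mirrored in a short sketch (Python, with hypothetical names; in the original language these are expressed by CREATE/DESTROY, FILE/REMOVE, assignment, and control statements):

```python
# Hypothetical illustration of the four kinds of action listed above.

system = []                          # all entities currently in the system
shipments_at_port = []               # a set that entities can be filed in

# 1. Expand or contract the system: create and destroy entities.
shipment = {"tonnage": 120, "stowage_category": 2}
system.append(shipment)              # create

# 2. Rearrange the system: move entities in and out of sets.
shipments_at_port.append(shipment)   # cf. FILE ... IN ...
shipments_at_port.remove(shipment)   # cf. REMOVE ... FROM ...

# 3. Modify the state of an entity: assign a new attribute value.
shipment["tonnage"] += 30

# 4. Affect the interactions in the system: e.g. signal another process.
#    (The original language uses control statements such as WAIT and
#    START; here just a flag that another routine might poll.)
dock_may_proceed = True

system.remove(shipment)              # destroy
```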

Figure 1 shows the relationship between the static and dynamic descriptions; 
it is significant that a process has both static and dynamic characteristics. 

[Figure 1: the relationship between the static and dynamic descriptions of a system]







Briefly stated, preparing a system description entails specifying each type of entity that will exist while the system is operating. For each entity that represents a system process a corresponding behavior pattern is specified, which
describes a series of actions that occur over a period of time. When the model 
is operated, the combined effect of the specified actions is to simulate the changes 
of state that occur while a system is in operation. 

3. THE SYSTEM 

The system herein described consists of the operation of a scheduled steamship 
cargo service. The general pattern of operations is as follows: 

A fleet of cargo vessels operating under a government subsidy contract makes regular calls at a specified series of ports on each of two continents. At any specified port vessels may load cargoes destined for a specified port on the other continent (i.e., coastwise operations are not permitted). Various vessel itineraries may be followed or, in other words, not all vessels must call at all ports. All ports must be served, however, and total annual sailings must be maintained within a specified min-max range.

3.1. Entities 

Principal entities are listed with an incomplete but typical set of attributes.

Shipments 

ATTRIBUTES

Stowage category (dry, liquid, refrigerated) 

Commodity class 
Tonnage 
Destination port 

Cargoes 

ATTRIBUTES 

Stowage category (dry, liquid, refrigerated) 

Destination port 
Tonnage 

Ports 

ATTRIBUTES

Average berthing delay 

Discharge rate in tons per hour 

Loading rate in tons per hour by stowage category 

Vessels 

ATTRIBUTES

Normal speed in knots 

Stowage capacity in tons by category (dry, liquid, refrigerated) 

Total allowable deadweight tons 




Schedules 

"Expected" time intervals for steaming and port operations for a given vessel and itinerary.

3.2. Principal Processes 

Cargo Generation (causes shipments to become available at ports in typical amounts at typical time intervals)

Dock Operation (collects shipments into cargoes) 

Voyage (process of taking a vessel to each port on a schedule) 
Discharging (time variable with nature and amount of cargo) 

Loading (time variable with nature and amount of cargo) 


4. MEASURES OF PERFORMANCE 

System performance characteristics of interest include the following: 

Total revenue 
Vessel utilization 

Average tons on board per tons capacity 
Steaming time versus port time 
Port-to-port cargo transit times 

If appropriate sets of "cost attributes" can be specified with an accounting structure for all processes, these measures of performance may be extended to include a profit-and-loss evaluation of various modes of operation.
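As a rough illustration of how such measures might be accumulated from simulation observations (a hypothetical Python sketch with invented figures, not taken from the paper):

```python
# Hypothetical sketch: accumulating the performance measures named above
# from per-leg observations of one vessel.

legs = [
    # (hours steaming, hours in port, tons on board, tons capacity)
    (96.0, 30.0, 8000.0, 10000.0),
    (72.0, 24.0, 6500.0, 10000.0),
]

steaming = sum(leg[0] for leg in legs)      # total steaming time
port_time = sum(leg[1] for leg in legs)     # total port time

# Average tons on board per ton of capacity, time-weighted by leg length.
weights = [leg[0] + leg[1] for leg in legs]
utilization = sum((leg[2] / leg[3]) * w
                  for leg, w in zip(legs, weights)) / sum(weights)

print(round(steaming / port_time, 2), round(utilization, 3))
```

Total revenue and port-to-port transit times would be accumulated in the same way, from observations attached to cargoes rather than to vessel legs.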


5. FORMAL DESCRIPTION OF A SYSTEM 

A formal description of the system introduced in Section 3 is given in Section 6 (A Representative Model of Steamship Cargo Operations) in two major sections. A static section identifies all components of the model and a dynamic section specifies its behavior.

A typical description is written as follows: 

SYSTEM

STATIC

series-of-entity-attribute-specification

END

DYNAMIC

series-of-behavior-specification

END

EXECUTE identifier;

The static section contains the specification of each type of entity in the 
system. Any system component may be represented as an entity, which is the 
basic unit that can be manipulated by the system processes. Each entity is 
specified by a group of one or more attributes associated with it. The number
of attributes determines the level of description required at any point in time, 
and the state of the entity is defined by the values attained by its attributes. 

Specification of a cargo as discussed in Section 3 would be written as 
follows: 

CARGO: 

STOWAGE-CATEGORY 

TONNAGE 

DESTINATION-PORT; 

The identifiers used in a specification may contain as many as 31 characters. Allowable characters are the following:

1. The letters A through Z.

2. The digits 0 through 9.

3. The special characters and _

When a specific cargo is created, its name is assigned as the value of a 
variable; for example, X. A reference to an attribute of that cargo, as shown 
in an assignment statement, determines the specific cargo and its attribute. 

X.STOWAGE_CATEGORY = 1;

The basic unit for describing the system dynamics is a process. Like any entity, a process exists over a period of time. While it exists it may be observed and/or manipulated by other processes; it may also contain its own state, as defined by the values of the attributes associated with the process in the static section. In addition to these characteristics, a process may be active while it exists; in fact, it is more correct to view a process as taking place rather than simply existing. Therefore, in addition to specifying a group of attributes to be associated with each process of a given type, it is also necessary to specify a behavior pattern to be associated with these processes. These behavior patterns, one for each type of process, are specified in the dynamic section. The format of a behavior block is as follows:




Identifier: PROCESS (parameter-list) 

LOCAL attribute-group; 
list-of-statements 
END identifier; 

5.1. Parameter List (Optional) 

The variables declared in the parameter list are associated with the variables specified in an argument list when a process is started.

5.2. LOCAL Declaration 

The variables declared in the LOCAL declaration may be assigned inter- 
mediate values produced by the list of statements. Additional LOCAL 
declarations may be made within 

BEGIN END; 

blocks, which allow the grouping of statements to function as a single statement. 

Control statements provide a means of specifying event-to-event relation- 
ships within and interactions between processes. Each process has several 
attributes that define its state with respect to a sequencing algorithm which 
in turn maintains a simulated clock and controls process behavior as specified by 
control statements. These attributes may in general be referenced in the same 
manner in which other attributes of a process are referenced. The sequencing 
algorithm maintains three sets of processes: 

1. Processes scheduled to resume at a future point in simulated time. 

2. Processes ready to resume at the current point in simulated time. 

3. The processes and subprocesses that are currently active. 

The sequencing algorithm is unaffected by unscheduled processes; for 
example, those that have been delayed or are waiting for a signal. A specific 
process is active when its associated behavior pattern is being executed and its 
name has been assigned as the value of the system attribute CURRENT. 
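The three sets maintained by the sequencing algorithm suggest a conventional event-driven simulation loop. The paper does not give the algorithm itself; the following is a minimal hypothetical Python sketch of how such a loop can work:

```python
import heapq

# Hypothetical sketch of the sequencing algorithm described above: a
# future-events set (a time-ordered heap), a current-events set, and the
# process currently active (cf. the system attribute CURRENT).

future = []        # (resume_time, seq, process) scheduled for a later time
current = []       # processes ready to resume at the current time
clock = 0.0        # the simulated clock
CURRENT = None     # the process now being executed
seq = 0            # tie-breaker so simultaneous events pop in FIFO order

def schedule(process, delay):
    global seq
    heapq.heappush(future, (clock + delay, seq, process))
    seq += 1

def run():
    global clock, CURRENT
    trace = []
    while future or current:
        if not current:                    # nothing ready: advance the clock
            t, _, p = heapq.heappop(future)
            clock = t
            current.append(p)
        CURRENT = current.pop(0)           # make one process active
        trace.append((clock, CURRENT.__name__))
        CURRENT()                          # execute one event of its behavior
    return trace

def voyage():
    schedule(discharge, 5.0)               # arrive, then discharge later

def discharge():
    pass

schedule(voyage, 2.0)
trace = run()
print(trace)   # -> [(2.0, 'voyage'), (7.0, 'discharge')]
```

A process that is delayed or waiting for a signal simply appears in neither set, which is why the sequencing algorithm is unaffected by it until some other process reschedules it.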


6. A REPRESENTATIVE MODEL OF STEAMSHIP 
CARGO OPERATIONS 

The model described in this section has been simplified for reasons of brevity. 
Particularly, the details of the dispatcher process have been omitted. It would 
be this process that would effect operating decisions, implement schedules, 
and so on. 

SYSTEM /* STEAMSHIP CARGO OPERATIONS */

STATIC

NORTH /* TYPE OF PORT */
SOUTH
STEAMING /* VOYAGE STATUS */
BERTHING
IN_PORT
ZERO; /* INTEGER 0 */

CARGO_GENERATION: PROCESS /* GENERATES SHIPMENTS OF A GIVEN COMMODITY FOR EACH PORT OF ORIGIN */
COMMODITY_CLASS /* 1 TO 30 */
ANNUAL_TONS
INTER_ARRIVAL_TIME; /* TIME BETWEEN GENERATIONS */

SHIPMENT:
COMMODITY_CLASS
STOWAGE_CATEGORY /* 1 TO 3 */
TONNAGE
DESTINATION_PORT;

CARGO:
STOWAGE_CATEGORY
TONNAGE
DESTINATION_PORT;

DOCK_OPERATION: PROCESS /* FORMS CARGOES FROM SHIPMENTS WITH SAME STOWAGE CATEGORY */
PORT;

VOYAGE: PROCESS /* PROCESS OF TAKING A VESSEL TO EACH PORT ON A SPECIFIED SCHEDULE */
VESSEL;

VESSEL:
SCHEDULE
SCHED_POINT /* INDEX TO PORT OF SCHEDULE THAT VESSEL IS AT OR HAS JUST LEFT */
STATUS /* STEAMING, BERTHING, IN_PORT */
TOTAL_ALLOWABLE_DWT
TOTAL_DWT
AVAIL_STOWAGE <3> /* BY STOWAGE CATEGORY */
TONS_BY_DEST <3,25>
NORMAL_SPEED /* KNOTS */
PORT_COUNT
VOYAGE_COUNT
EOV; /* PORT AT WHICH A VOYAGE IS CONSIDERED TO END */

SCHEDULE:
NUMBER_OF_PORTS
PORT_OF_CALL <25>
EXPECTED_ARRIVAL_TIME <25>; /* FOR USE BY THE DISPATCHER */

LOADING: PROCESS /* LOADS CARGOES ONTO A VESSEL AT A PORT */
PORT
VESSEL;

DISCHARGE: PROCESS /* REMOVES TONNAGE FROM A VESSEL AT A PORT */
PORT
VESSEL;

PORT:
TYPE /* NORTH, SOUTH */
AV_BERTHING_DELAY /* HOURS */
DISCHARGE_RATE <3> /* TONS/HOUR BY STOW CATEGORY */
LOADING_RATE <3>
SHIPMENT_HANDLING_TIME /* SHIPMENTS/HOUR */
SHIPMENTS /* SET OF SHIPMENTS TO BE SENT FROM A PORT */
CARGOES /* SET OF CARGOES ORIGINATING FROM A PORT */
COMMODITY_OUT <30>
COMMODITY_IN <30>; /* % OF ANNUAL TONS OF EACH COMMODITY SHIPPED FROM A PORT */

DISPATCHER: PROCESS; /* CREATES SCHEDULES, STARTS VOYAGES, MONITORS SYSTEM, ETC. */

DISTANCE: PROCEDURE; /* FUNCTION THAT COMPUTES DISTANCE BETWEEN A SPECIFIED PORT PAIR */

END;





DYNAMIC

CARGO_GENERATION: PROCESS
LOCAL PO, I, S, T, PD;

S1: TAKE INTER_ARRIVAL_TIME;
FOR EACH PO OF PORT DO
IF PO.COMMODITY_OUT<COMMODITY_CLASS> NE ZERO
THEN BEGIN
T = ANNUAL_TONS * PO.COMMODITY_OUT<COMMODITY_CLASS>;
FOR EACH PD OF PORT WITH PD.TYPE NE PO.TYPE DO
IF PD.COMMODITY_IN<COMMODITY_CLASS> NE ZERO
THEN BEGIN
CREATE SHIPMENT NAMED S;
S.STOWAGE_CATEGORY = RANDI(1,3);
S.COMMODITY_CLASS = COMMODITY_CLASS;
S.DESTINATION_PORT = PD;
S.TONNAGE = (PD.COMMODITY_IN<COMMODITY_CLASS> * T) / INTER_ARRIVAL_TIME;
FILE S IN PO.SHIPMENTS;
END;
END;
GO TO S1;
END CARGO_GENERATION;

DOCK_OPERATION: PROCESS
LOCAL S, C, I, K;

S1: TEST PORT.SHIPMENTS.LIST_COUNT EQ ZERO
THEN WAIT CHANGE PORT.SHIPMENTS.LIST_COUNT;
ELSE BEGIN
S = ANY S OF PORT.SHIPMENTS;
REMOVE S FROM PORT.SHIPMENTS;
CREATE CARGO NAMED C;
C.STOWAGE_CATEGORY = S.STOWAGE_CATEGORY;
C.DESTINATION_PORT = S.DESTINATION_PORT;
C.TONNAGE = S.TONNAGE;
DESTROY S;
K = 1;
IF PORT.SHIPMENTS.LIST_COUNT NE ZERO
THEN FOR EACH S OF PORT.SHIPMENTS
WITH S.DESTINATION_PORT EQ C.DESTINATION_PORT
AND S.STOWAGE_CATEGORY EQ C.STOWAGE_CATEGORY DO
BEGIN
REMOVE S FROM PORT.SHIPMENTS;
C.TONNAGE = C.TONNAGE + S.TONNAGE;
DESTROY S;
K = K + 1;
END;
FILE C IN PORT.CARGOES;
TAKE K * PORT.SHIPMENT_HANDLING_TIME;
END;
GO TO S1;
END DOCK_OPERATION;
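The consolidation logic of DOCK_OPERATION — take any waiting shipment, then sweep every other shipment with the same destination and stowage category into one cargo — can be restated in a hypothetical Python sketch (names and data invented for illustration):

```python
# Hypothetical restatement of the DOCK_OPERATION consolidation step.
# A shipment is represented as (destination_port, stowage_category, tonnage).

def form_cargo(shipments):
    """Remove one shipment and all compatible ones from the port's set;
    return the combined cargo and the number of shipments handled
    (cf. K in the original, which scales the handling time taken)."""
    s = shipments.pop(0)                  # ANY shipment of the port set
    dest, cat, tons = s
    k = 1
    remaining = []
    for other in shipments:               # same destination and category?
        if other[0] == dest and other[1] == cat:
            tons += other[2]              # merge tonnage into the cargo
            k += 1
        else:
            remaining.append(other)
    shipments[:] = remaining              # shipments left waiting at the port
    return (dest, cat, tons), k

waiting = [("PD1", 1, 100.0), ("PD2", 1, 50.0), ("PD1", 1, 25.0)]
cargo, handled = form_cargo(waiting)
print(cargo, handled)   # -> ('PD1', 1, 125.0) 2
```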

VOYAGE: PROCESS
LOCAL PO, PD, PORT, D, L, I, J, PE;

S1: PO = VESSEL.SCHEDULE.PORT_OF_CALL<VESSEL.SCHED_POINT>;
IF VESSEL.SCHED_POINT EQ VESSEL.SCHEDULE.NUMBER_OF_PORTS
THEN VESSEL.SCHED_POINT = 0;
PD = VESSEL.SCHEDULE.PORT_OF_CALL<VESSEL.SCHED_POINT + 1>;
IF PO.TYPE EQ SOUTH AND PD.TYPE EQ NORTH THEN
BEGIN
PE = NULL;
FOR I = 1 TO VESSEL.SCHEDULE.NUMBER_OF_PORTS DO
IF VESSEL.SCHEDULE.PORT_OF_CALL<I>.TYPE EQ NORTH
THEN FOR J = 1 TO 3 DO
IF VESSEL.TONS_BY_DEST<J,I> NE ZERO
THEN PE = VESSEL.SCHEDULE.PORT_OF_CALL<I>;
TEST PE EQ NULL
THEN VESSEL.EOV = PD;
ELSE VESSEL.EOV = PE;
END;
VESSEL.STATUS = STEAMING;
TAKE DISTANCE(PO,PD) / VESSEL.NORMAL_SPEED;
VESSEL.SCHED_POINT = VESSEL.SCHED_POINT + 1;
VESSEL.PORT_COUNT = VESSEL.PORT_COUNT + 1;
VESSEL.STATUS = BERTHING;
TAKE PD.AV_BERTHING_DELAY;
VESSEL.STATUS = IN_PORT;
START DISCHARGE NAMED D WHERE PORT = PD, VESSEL = VESSEL;
START LOADING NAMED L WHERE PORT = PD, VESSEL = VESSEL;
WAIT END ALL D, L;
IF PD EQ VESSEL.EOV
THEN VESSEL.VOYAGE_COUNT = VESSEL.VOYAGE_COUNT + 1;
GO TO S1;
END VOYAGE;
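The end-of-voyage (EOV) search in VOYAGE — on a south-to-north crossing, find the last northern port of call for which cargo remains on board — can be restated as a hypothetical Python sketch (all names and figures invented):

```python
# Hypothetical restatement of the EOV determination in VOYAGE: scan the
# schedule and remember the last northern port that still has tons
# consigned to it in any stowage category.

def end_of_voyage(schedule, tons_by_dest, default_port):
    """schedule: list of (port, port_type); tons_by_dest[cat][i]: tons
    destined for port of call i in stowage category cat (0-indexed here).
    Returns the port at which the voyage is considered to end."""
    pe = None
    for i, (port, port_type) in enumerate(schedule):
        if port_type == "NORTH":
            for cat in range(3):
                if tons_by_dest[cat][i] != 0:
                    pe = port          # keep the last qualifying port
    return default_port if pe is None else pe

schedule = [("N1", "NORTH"), ("S1", "SOUTH"), ("N2", "NORTH")]
tons = [[0.0, 0.0, 120.0],    # dry tons by port of call
        [0.0, 0.0, 0.0],      # liquid
        [50.0, 0.0, 0.0]]     # refrigerated
print(end_of_voyage(schedule, tons, "N1"))   # -> N2
```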

LOADING: PROCESS
LOCAL C, F, I, TIME;

TIME = ZERO;
FOR EACH C OF PORT.CARGOES DO
BEGIN
F = ZERO;
FOR I = 1 TO VESSEL.SCHEDULE.NUMBER_OF_PORTS
WITH VESSEL.SCHEDULE.PORT_OF_CALL<I> NE PORT
WHILE F EQ ZERO DO
IF C.DESTINATION_PORT EQ VESSEL.SCHEDULE.PORT_OF_CALL<I>
AND C.TONNAGE LE VESSEL.AVAIL_STOWAGE<C.STOWAGE_CATEGORY>
AND (VESSEL.TOTAL_DWT + C.TONNAGE) LE VESSEL.TOTAL_ALLOWABLE_DWT
THEN BEGIN
VESSEL.TOTAL_DWT = VESSEL.TOTAL_DWT + C.TONNAGE;
VESSEL.AVAIL_STOWAGE<C.STOWAGE_CATEGORY>
= VESSEL.AVAIL_STOWAGE<C.STOWAGE_CATEGORY> - C.TONNAGE;
VESSEL.TONS_BY_DEST<C.STOWAGE_CATEGORY,I>
= VESSEL.TONS_BY_DEST<C.STOWAGE_CATEGORY,I> + C.TONNAGE;
TIME = TIME + (C.TONNAGE / PORT.LOADING_RATE<C.STOWAGE_CATEGORY>);
F = 1;
DESTROY C;
END;
END;
TAKE TIME;
STOP;
END LOADING;





DISCHARGE: PROCESS
LOCAL T, I, J, TIME;

TIME = ZERO;
J = VESSEL.SCHED_POINT;
FOR I = 1 TO 3 DO
BEGIN
T = VESSEL.TONS_BY_DEST<I,J>;
VESSEL.TONS_BY_DEST<I,J> = ZERO;
VESSEL.AVAIL_STOWAGE<I> = VESSEL.AVAIL_STOWAGE<I> + T;
TIME = TIME + (T / PORT.DISCHARGE_RATE<I>);
END;
TAKE TIME;
STOP;
END DISCHARGE;

DISPATCHER: PROCESS
.
.
.
END DISPATCHER;

DISTANCE: PROCEDURE (PO,PD)
.
.
.
END DISTANCE;

END;

EXECUTE DISPATCHER;

7. CONCLUSIONS 

As suggested in the introduction, models of the type discussed in this article have a significant potential role as a tool for industrial management. This is particularly true when model outputs are extended to include financial evaluations by the addition of cost parameters and accounting relationships, a straightforward, though nontrivial, task that is not treated here because of space limitations.

The model described, augmented by a financial framework, is viewed as having a continuing relevance to a range of management problems. There are strong parallels between the kinds of experiments and evaluations that can be made with this model and the requirements of planning and analytical staff work carried on by most companies. Much of the scope of repetitive forecasts and other projections, together with special studies and evaluations





of alternative courses of action, falls within its capabilities, which might be described as "staff analogs." This model is not viewed primarily as a replacement for a human staff but rather as a means of extending staff capability, both in scope and speed of response, to management.

A number of anticipated areas of application in the context of scheduled 
commercial steamship cargo operations are listed: 

1. Evaluation of alternative vessel scheduling strategies under various conditions. Sets of assumed conditions, defined and simulated, could range from a close approximation of present cargo traffic to expected levels in various future time periods. Unusual conditions, such as heavy cargo backlogs developed in the course of work stoppages, could be usefully represented and the model used to explore and evaluate various "recovery" scheduling strategies.

2. Evaluation of the over-all economic effects of general or selective increases or decreases in the level of cargo bookings. Such evaluations should provide useful inputs for planning and directing sales effort.

3. A more comprehensive evaluation by the company of the constraints imposed on its cargo operations by reason of its government subsidy contract. This evaluation would entail the simulation of operations with a subsidy restriction such as the "min-max" sailing requirements, then simulating operations under the same conditions without the restriction, and comparing the two over-all economic performances.

4. Evaluation of various fleet configurations. The model could provide greater flexibility in altering assumptions about the number, capabilities, and operating characteristics of the vessels of the company fleet. These evaluations should provide useful inputs for planning vessel assignments and for developing vessel construction programs.

5. As a forecasting device. The model could serve as a useful forecasting 
device when the assumptions used reflect expected future conditions in the 
business environment and mode of operations. 

8. ACKNOWLEDGMENT 

The language discussed in this article was developed at the Advanced Systems Development Division of IBM by a group that includes K. R. Blake, H. S. Krasnow, B. M. Leavenworth, and S. C. Pierce, IBM, ASDD, and G. P. Blunden, IBM, UK.

UNE DESCRIPTION FORMELLE DES 
OPERATIONS D’UN PAQUEBOT CARGO 

RÉSUMÉ

A model of a scheduled cargo steamship service is described. The essential physical characteristics of such a cargo service are set out, together with the constraints proper to the operations of this steamship company. An experimental formal language for system description is then presented and its elements are defined. The elements of the service are then presented within the framework of this descriptive language. Finally, the resulting model is expressed in the experimental language in order to demonstrate its essential features in context. A discussion of the potential uses of simulation models of this kind as an aid to management is also included in this article.


OPERATIONAL RESEARCH IN URBAN 
PUBLIC PASSENGER TRANSPORT 
IN THE UNITED KINGDOM 

La Recherche Opwationnelle dans les 
Transports Publics Urbains dans le Royaume- Uni 

P. I. Welding 

London Transport Board, United Kingdom


1. INTRODUCTION

There is a danger in a review article of this kind of simply cataloging a whole 
series of unrelated research projects without being able to give an over-all 
picture of progress in research into the business of urban passenger transport. 
To try to avoid this danger I have written this review around the framework 
of a simple model of the important elements and activities that go to make up 
the provision of this particular kind of service. The framework I have used 
is represented diagrammatically in Appendix 1. I have confined my article to 
operations research into those activities of the industry peculiar to transportation 
and have not included reference to the contributions made by operations research in other large and vital activities, such as stock keeping or engineering, that are common to many other industries.

The diagram is intended solely for illustrative purposes and should not 
be taken as representing a complete and accurate model of the industry. In 
this review I deal with each of the blocks as subparagraphs but before doing 
so will say something about the structure of the urban public passenger industry 
in Britain. 


2. THE STRUCTURE OF THE INDUSTRY 

A distinction should first be made between London and other cities and conur- 
bations. In London the London Transport Board, a nationalized body, operates 
both the extensive underground railway system and the bus system.

OPERATIONAL RESEARCH IN URBAN TRANSPORT IN THE UNITED KINGDOM 385

British Railways, also nationalized, operate an extensive system of suburban lines into
London. In other cities the predominant form of public transport is the bus, 
although British Railways similarly provide suburban rail services into most 
of the larger cities. In the United Kingdom the majority of urban bus lines 
are municipally owned. A few urban companies and a number of companies operating long- and medium-distance services into cities are owned by two
large bus groups in which the government has a substantial holding. Thus 
the urban public transport industry is predominantly public-owned in one 
way or another. 

As far as operations research is concerned, British Railways and London 
Transport have full-time groups of their own. Some of the larger municipalities 
have access to operations research specialists or alternatively have drawn in 
outside operations research consultants or universities. 

3. THE TOTAL TRANSPORTATION-LAND USE PROBLEM 

The most notable development in urban transport over the last decade has 
been the evolution of the "Transportation-Land Use" survey. Until it appeared public urban transport had been regarded as an enterprise in its own right rather than part of a larger transport complex. Transportation-Land Use surveys owe a great deal to operations research philosophy in the use of statistical methods of analysis and the formation of mathematical models. They are based on the concept that transport needs are directly related to the type and density of land use. Conversely, land use is closely related to accessibility in the transport sense.

The use of Transportation-Land Use surveys in Britain has followed earlier developments in the United States. The earliest British study, made in London, began in 1961 [1]. Since then studies have been started in several conurbations, including Glasgow, Teesside, Tyneside, Merseyside, Manchester, Birmingham, the West Riding of Yorkshire, and in many smaller towns and cities.


Mode of Travel in London
(per cent of passenger journeys)

Mode of Transport    [first column header illegible]    Central Area
Cars and cycles                    44                        21
Buses                              35                        28
Railways                           19                        48
Others                              2                         3
Total                             100                       100


The emphasis in British studies has been somewhat different; it has reflected 
the difference in conditions and problems. In the United States the role played 
by public transport in all but a few of the major cities, such as New York and 




P. I. WELDING


Chicago, had already fallen to a small proportion of the total transport demand 
when these surveys were developed, so that the main problem was to predict 
the growth of private transport and to devise a road network adequate to deal 
with it. In Britain, by contrast, the proportion of total passenger transport 
provided by the public sector is much higher. Therefore the problem of “ modal 
split” between different forms of transport and its possible change in the future 
assumes considerably more importance. 

Necessity has been the mother of invention in this field as in many others. The need for transportation studies has developed from problems in both public and private sectors of transport, which have become increasingly acute
in recent years. As far as the private sector is concerned, the growth of private 
automobile traffic in Britain has far outdistanced the rate at which roads have 
been built or indeed, with the available resources, could have been built since 
World War II. In the public sector the fall in traffic has not only affected the financial viability of services but the resultant worsening in services has reduced their attractiveness. The failure of road-building and traffic-engineering
improvements to keep pace with the growth in traffic has led to aggravated 
congestion, which in turn has affected the speed and reliability of bus services. 
As shown by the following table, this deterioration has been reflected in a 
rapid loss in passengers on bus services compared with rail. 


Passenger Traffic into Central London
During the Morning Peak (07.00-10.00 Hours)
(thousands)

                           1954     1964     Percentage Change 1964:1954
Public transport
  British Railways          396      492          +24
  Underground*              370      397          + 7
  Road services             269      191          -29
  Total                    1035     1080          + 4
Private transport
  Private car                55       98          +78
  Motor cycle/scooter        10       16          +60
  Pedal cycle                13        6          -54
  Total                      78      120          +54
Grand total                1113     1200          + 8
  By rail                   766      889          +16
  By road                   347      311          -10

* Excludes 99,000 passengers in 1954 and 123,000 in 1964 who started
their journeys on British Railways.
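The percentage-change column follows directly from the 1954 and 1964 counts; a quick arithmetic check (illustrative Python, not part of the original report):

```python
# Quick check of the percentage-change column in the table above.
def pct_change(v1954, v1964):
    return round(100.0 * (v1964 - v1954) / v1954)

print(pct_change(396, 492),    # British Railways  -> +24
      pct_change(269, 191),    # Road services     -> -29
      pct_change(55, 98),      # Private car       -> +78
      pct_change(1113, 1200))  # Grand total       -> +8
```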



The result of looking at the transport complex as a whole has been that public transport operators have been brought into the Steering and Control Committees running Transportation-Land Use studies in the various conurbations, which has led to a great upsurge of interest in research into transport planning. Also, in many smaller towns, such as Basingstoke, the attitude of planners toward catering to the needs of public transport in terms of town layout has been refreshing.


4. MODAL SPLIT 

The feeling that the problem of congestion in British cities might be more than could be overcome simply by expenditure on roads led the Minister of Transport to ask Professor Buchanan to make a study of the problem [2], taking into account especially the effect on amenity and human environment and on the historic heritage of cities of various methods of dealing with the transport problem. In spite of the admitted limitations in depth (which has led to some criticism [3]) to which the study was able to be taken in the time available, the report has made a considerable impression in Britain, mainly because it had a reality for the general public that previous work in the field of transportation studies had lacked.

The principal conclusions reached were that only in the smaller towns could the entire transport needs be met by private transport, and even then certain sections of the community, such as children and old people, would tend to become immobilized. The larger the town, the less the proportion of transport needs that could be dealt with by private transport, until in the case of London only about 20 per cent of commuter traffic could come by car even with a high expenditure on reshaping towns. It was also pointed out that in comparison with the expenditure on vehicles the amount of money spent on roads and reconstruction of cities was small. It has since been made clear, however, that the scale of money available from public funds, although higher than before, is not likely to be on the scale required for the more ambitious possibilities envisaged by Professor Buchanan. The net conclusion is that the contribution to be made by public transport is likely to be considerable into the foreseeable future.

Work has also been done by the Road Research Laboratory [4] aimed at clarifying the relationship between public and private road transport in cities. This work demonstrated the advantage in flexibility that the private car offers compared with bus travel but showed that it would be impossible to carry all road passenger traffic by private car with the existing occupancy and type of vehicle.

Further studies relevant to modal split are in hand by the University of Birmingham [5], and in London a sociopsychological investigation is being done to probe attitudes toward and preferences for different forms of transport.


5. TYPE OF SERVICE PROVIDED

With the exception of London, the principal form of public transport in the majority of British conurbations is the bus. Several cities have a limited amount





of suburban electric railway with links into or across the central area, but only one, apart from London, has an underground system, and that, at Glasgow, is limited to a simple circle around the center. These rail systems mainly serve suburban and longer-distance commuters.

As far as rapid transit services are concerned, there is a move parallel to that 
in the United States toward construction and extension of rail services, but the 
economics of rapid transit are such that it could make a worthwhile contribution 
only in the larger cities and conurbations. In London extensions to the tube 
system that will form a significant part of the over-all transport plan for the 
conurbation have been proposed. In the Merseyside conurbation there are 
proposals for electrifying certain existing surface lines in Liverpool to create 
an outer rail loop. An underground link across the city center would be pro- 
vided. Also a terminal loop on an existing underground line would be con- 
structed to give increased capacity and improved city-center facilities. 

Monorails are being seriously studied in connection with several cities and, 
in particular, Manchester, which is satisfied that some form of network of rapid 
transit in the city is essential, has recently announced a full-scale feasibility study 
and is carrying out operations research on various associated problems. 

The predominant type of bus service in Great Britain is provided by the 
two-man double-deck high-capacity bus operating on a graduated-fare system. 
In addition, a certain amount of one-man operation with single-deck vehicles 
uses graduated fares on quiet routes. In general the flat fare has not been found 
economical in British conditions of wage levels and distribution of journey 
length, but with the increasing cost of labor there is a move in that direction. 
Experiments are soon to be started in London, in which such services will be 
evaluated by operations research, and a proposal has been made in Sunderland 
to operate all the town services on flat fares. 

Another potentially important development is the use of some automatic 
means of collecting fares, based on a graduated scale, that would enable the 
bus to be operated by one man. Manchester has already experimented with 
electronic ticket cancelers and flat fares, and in London a start has been made 
on a comprehensive development in this direction for both buses and under- 
ground. In Runcorn, a town to be much expanded, the proposals include a 
rapid transit system to consist of a monorail or express buses on reserved 
tracks. 

These technical developments, combined with a reappraisal of the role of 
public transport, are leading to a considerable amount of new thinking and 
experimentation in most major cities in the United Kingdom. 

To take, for example, the city of Leeds, the scope of a widespread system 
of express buses operating with a flat-fare system is beginning to look more 
promising with the provision of a network of motorways and a determination 
on the part of the city planners to deal with the problem of congestion within 
the city center. The city is working closely with the newly formed Department 
of Transport in the University, which is engaged on various operations research 
studies, especially into the commuter and parking problems. 

Leeds has also carried out an experiment into “park and ride,” using 



OPERATIONAL RESEARCH IN URBAN TRANSPORT IN THE UNITED KINGDOM 389 


express buses. Although this experiment in itself was not successful, it has led 
to a realisation of the importance of parking capacity in the city center for 
such services and has given rise to further research. Another suggestion made 
for the city was that transport within the center should be by means of small 
electrically operated buses that would ply at a low speed through shopping 
streets. This solution, similar to the mini-bus service in Washington, is favored 
by planners on the grounds of the loss of environmental values caused by modern 
diesel buses. As far as the bus industry is concerned, such vehicles have not 
found favor, for, with the struggle to meet rising labor costs in a falling market, 
the tendency has been toward larger buses. 

Leeds is also the subject of a “special partnership” with the Ministry of 
Transport, with which it has regular liaison meetings. It will be used for 
experimental and study purposes as typifying the larger British city. 


6. DESIGN OF VEHICLES AND FIXED EQUIPMENT 

A field in which a great deal of valuable operations research has been done in 
public transport is that of design of equipment. Such studies are part of the 
general design knowledge on which transport systems must be based. As far 
as rapid transit is concerned, it is worth recording work by British Railways [6] 
on booking and passenger facilities and by London Transport [7] on capacity 
of passageways and design of stations for automatic fare collection. Both 
undertakings have done studies on layout of rolling stock, especially in relation 
to loading and unloading times, and doors. 

Work on assessing various types of bus and their equipment has been done 
in London, and a major investigation is in hand in connection with various 
experimental types of vehicle. The design of bus stations and the associated 
queuing problems have also been studied by simulation methods. 


7. THE TRANSPORT NETWORK AND ROUTING PROBLEM 

A great deal of work has been done on transport networks in connection with 
Transportation-Land Use studies. At the moment the available techniques 
extend only as far as the allocation of a fixed traffic demand to a network, given 
the requirements, for example, in terms of shortest route. Research is going 
on into programs for providing capacity restraint and other refinements. As 
far as is known, no program exists in the field of Transportation-Land Use 
studies for providing any kind of optimum network. The present procedure 
in such studies is, in effect, to choose one or several initially reasonable networks 
and, after alignment of traffic demand, to improve the characteristics of the 
network more or less by visual inspection and recycling. 

It would clearly be of great benefit if more sophisticated methods of devising 
optimal networks could be developed, and the work done in Wallasey is there- 
fore of great interest as a step in that direction. This work [8] is described in 





full in another paper, and I do not therefore propose to say more about it. 
Routing problems for buses are also being studied in Bristol. 

Work being done by the Network Analysis Group at the London School of 
Economics must also be especially noted. They are at present carrying out 
assignments of traffic to networks for London. 

Not only do operations research techniques require extension in the direction 
of optimizing simple networks but there is a need for methods of considering 
various forms of transport together and as alternatives and for deciding on prob- 
lems such as the scope of express services and the optimum frequency of 
stations and stops. 


8. SCHEDULING 

Having decided on the type of vehicle and service to be provided, the route 
pattern and the frequencies of service, the next problem, common to both road 
and rail, is to provide a time schedule or timetable and a duty schedule or roster 
that will meet the required demand or something close to it with the least ex- 
penditure of resources. This problem can be seen as two inter-related stages, 
first to fit the timetable to the passenger requirement and second to fulfill the 
timetable with the minimum resources of vehicles and men. Simple though 
this problem is to state, its mathematical solution has proved to be extremely 
intractable, and this article is a convenient way of drawing attention to it. At 
present, vehicle and crew scheduling in public transport is done manually. A 
mathematical solution, potentially, might produce (a) savings in cost of pre- 
paring schedules (although the amount of manpower involved is not usually 
great in relation to the cost of the operation), (b) savings in vehicles and crews, 
and (c) a more flexible operation resulting from speedy schedule compilation. 

Of these three potential benefits (b) is the most important, and in the early 
stages of research in this field great hopes were entertained that it would be 
possible by means of computers to produce considerable economies. After 
many separate attempts, for example, in Oxford, Manchester, London, the Til- 
ling Association with Liverpool University, and in other places in the United 
Kingdom (as well, of course, as a great deal in the United States), to produce a 
solution to the bus problem, and a recent study in London of the underground 
scheduling problem, it must be stated that the result so far has been disappoint- 
ing. One of the difficulties is the great variety of local industrial agreements 
that must be included. The only bright spot is that British Railways have 
developed a program for minimizing the number of units of rolling stock re- 
quired to operate a given timetable, which has been applied to the Edinburgh/ 
Fife suburban passenger services, and are hoping soon to develop practical 
programs for locomotive scheduling and crew rostering. 

Perhaps the most unfortunate consequence of this inability to break through 
is that without a mathematical solution to this problem and the network problem 
it is difficult to conceive of a complete optimization of an urban passenger 
transport system. 





Another approach to overcoming disturbance in a route is to introduce 
redundancy, either in the form of vehicles standing by, in additional layover 
time at termini to absorb irregular running, by increasing running time, or by 
subdividing routes into shorter sections. In London a good deal of operations 
research has been done both on the theoretical aspects of regularity (e.g. [9]) 
and on various practical experiments. Of course, the introduction of redun- 
dancy increases cost of operation and brings human problems, and so does 
not in any case provide a completely satisfactory solution for the bus operator. 
When public transport is required to perform a significant function in the total 
transport picture, the only satisfactory solution is segregation or priorities for 
public transport to increase speed and reduce irregular running. Although 
there are many proposals under consideration, no such major schemes have 
yet been introduced in the United Kingdom. 

On urban and suburban railways the problem is usually that at peak periods 
the system is operated near to capacity, and the inevitable consequence is that 
any slight disturbance caused by variations in passenger loading, a minor 
breakdown, or delay tends to snowball. 

There are in general two ways of overcoming such delays. First a great deal 
can be done in the direction of automated signaling, improved rolling stock, 
automatic driving, and better crowd-handling facilities. Second, control can 
be made more immediate and purposeful. These two developments tend to 
go together because it is often the introduction of more automation and improved 
signaling that leads to the communications needed for better control. In London 
a simulation of the operation of a line under congested conditions has been 
developed and validated [10]. Among the uses to which it has been put are 
the prediction of the effect of improvements in train performance and the 
effect of a new system of controlling intervals developed by the Paris Metro. 
A research project now in hand is aimed at setting the pattern for methods of 
control and communication and the design of future control centers. British 
Railways have also used simulation to study movements across the complex 
and congested Borough Market Junction in London. 


11. CONCLUSIONS 

The present is a period of growing interest and activity in research into urban 
public transport in the United Kingdom. This results from the changing role 
of public transport and from the widespread reconstruction of British towns 
and cities now in progress and being planned. 

The number of areas in which operations research methods have been and 
can be applied has been steadily increasing and will lead to a more all-round 
approach to the problems of the urban transport business. Two areas in par- 
ticular, in which available operations research techniques fall short of the re- 
quirement, are transport network theory and scheduling. 





12. REFERENCES 

[1] London Traffic Survey, Vol. 1, London County Council, 1964. 

[2] Traffic in Towns, H.M.S.O., London, 1963. 

[3] M. E. Beesley and J. F. Kain, “Urban Form, Car Ownership and Public Policy,” 
Urban Studies, November 1964. 

[4] R. J. Smeed, “The Traffic Problem in Towns,” Town Planning Rev., Liverpool 
University Press, 1964. 

[5] F. R. Wilson, “Traffic Assignment and Modal Split,” Traffic Engineering and Control 
(November 1965). 

[6] M. G. UrNsrrt, “Operational Research in British Transport,” British Transport 
Rev. (1961). 

[7] B. D. Hankin and R. A. Wright, “Passenger Flow in Subways,” Operational 
Res. Quart. 9, 2 (1958). 

[8] A. H. Lines, “The Design of Routes and Service Requirements for a Municipal 
Bus Company,” I.F.O.R.S. Conf., 1966. 

[9] P. I. Welding, “The Instability of a Close Interval Service,” Operational Res. 
Quart. 8, 3 (1957). 

[10] P. I. Welding and D. J. Day, “Simulation of Underground Railway Operation,” 
Railway Gazette, 121, No. 11 (1965). 


LA RECHERCHE OPÉRATIONNELLE DANS LES 
TRANSPORTS PUBLICS URBAINS DANS 
LE ROYAUME-UNI 


RÉSUMÉ 

The work presented here gives a summary of the applications of operational 
research in the field of urban public (passenger) transport in the United 
Kingdom, together with references to the other principal fields in which OR 
is used, to special applications, and to fields where research is under way or 
still needed. 

At present there is great interest and activity in this research. This stems 
first from the changing role of public transport and from the extensive recon- 
struction of British towns and cities now in progress or planned. 

We conclude that the widening range of fields of application of OR is leading 
to a more comprehensive approach to the industry's problems, but there are 
also some fields with methods available that do not yet give what is needed, 
in particular the theory of transport networks and scheduling. 



APPENDIX 1 



The problems involved in providing public transport. 
























THE DESIGN OF ROUTES AND SERVICE FREQUENCIES 
FOR A MUNICIPAL BUS COMPANY 

La Détermination des Itinéraires et de la Fréquence 
des Services d'une Compagnie Municipale d'Autobus 

A. H. Lines, W. Lampkin, P. D. Saalmans 

Business Operations Research Ltd., London 
United Kingdom 


1. INTRODUCTION 

This article is based on an investigation undertaken for the Passenger Transport 
Undertaking of a northern English town. The Undertaking is owned and 
operated by the municipal council as a public service, its day-to-day running 
being vested in a general manager who reports to a transport committee com- 
posed entirely of local politicians. 

In English local politics very definite views are held about how such under- 
takings should be operated. The Conservative, or right-wing, view is generally 
that this type of enterprise should be treated as a strictly commercial one, whereas 
the Labour, or left-wing, view is that such activities fall within the general scope 
of public services, such as health and education, and their cost can be met at 
least in part out of taxes. The latter view is, of course, tenable only if the 
amount of any subsidy can be restricted to what can be reasonably borne out of 
local taxes. 

In common with most similar undertakings in England, this town's bus 
services, which formed the subject of the investigation, are losing about 5 per 
cent of their traffic each year because of the spread of car ownership. In con- 
sequence, what was once a profitable enterprise has now become a serious loss- 
maker. 

The general manager took the view that fare increases would only accelerate 
the traffic decline (at least partly true) and that the only way of countering the 
trend in losses was to cut costs by eliminating the least patronized services, 
a course of action hotly opposed by those affected. In his analysis of the problem 
he formed the view that the position was being aggravated by the absence of 
any machinery by which routes and frequencies could be adapted to changes in 
the traffic problem. The existing route system and service frequencies had re- 
mained essentially the same for the past 30 years and had been designed to link 
up with railway and ferryboat services that now bore no resemblance to the 
originals. The town itself had undergone major changes in residential and 
industrial location and density. What was needed now, therefore, was a com- 
plete replanning of the town’s bus services to suit the new conditions and costs 
and a method by which services could be adapted in the future to take account 





of further declines in traffic and to enable the Undertaking as a whole to be 
operated to achieve any previously stated financial objective with reasonable 
certainty. 


2. APPROACH TO THE PROBLEM 

The first task that confronted the OR team in their investigation was to get the 
transport committee to set their objectives. Once it was generally understood 
that the better the services offered to the public the greater the loss that would 
result, it was possible to obtain agreement that the OR team should try to deter- 
mine the best possible services that could be provided within each of a number 
of stated financial objectives. Then the interested parties would at least have a 
quantitative basis on which to argue. 

This, of course, begged the question how to define “service,” a matter on 
which politicians are uniformly vague. The measure of service eventually 
chosen was the time spent by those members of the public who were prepared 
to use buses in waiting for and traveling on buses, together with the time spent 
walking at either end of their journeys. This is referred to as the “total travel 
time,” and a system of routes and frequencies that results in a low travel time is 
assumed to be better, in the sense that it provides a greater public service, than a 
system giving a high total travel time. In practical terms a low total travel time 
will be achieved if buses run at close time intervals along most streets of the 
town. 

The traveling requirements of the public had next to be defined. The 
geographical area served by the Undertaking was divided into small sectors, 
and the traffic arising from or terminating at each sector was assumed to be 
concentrated at a nodal point, usually placed symmetrically. In all, some 55 nodal 
points or “nodes” were used to define the traveling requirements in the form of 
a “demand matrix.” Traveling habits are dependent on the time of day, but an 
effective limitation on the number of matrices that may be used to define traveling 
requirements is the preference of the public for a timetable they can memorise. 
For week days, therefore, only four matrices were used: morning rush-hour, 
evening rush-hour, midday, and late evening. For Saturdays two matrices and 
for Sundays one matrix were found to be sufficient. 
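The seven matrices can be pictured concretely. A minimal sketch, assuming one NumPy array per period: the 55-node count and the period breakdown follow the text, but the random contents are purely illustrative.

```python
import numpy as np

N_NODES = 55  # nodal points defined in the text

# One origin-destination demand matrix per period (four weekday, two Saturday,
# one Sunday, as in the text). The random contents are purely illustrative.
rng = np.random.default_rng(0)
periods = ["weekday_am_rush", "weekday_midday", "weekday_pm_rush",
           "weekday_late_evening", "saturday_day", "saturday_evening", "sunday"]
demand = {p: rng.integers(0, 200, size=(N_NODES, N_NODES)) for p in periods}

# Travel within a single sector (the diagonal) is not a bus journey:
for m in demand.values():
    np.fill_diagonal(m, 0)

total_demand_by_period = {p: int(m.sum()) for p, m in demand.items()}
```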

The data from which the matrices were constructed were obtained by using 
special ticket machines that record the origin and destination of each passenger 
and by a limited number of surveys taken at points of interchange and in areas 
in which services were infrequent or widely separated. The actual figures used 
in each matrix were based on a sufficient number of observations to give average 
values, correct within the necessary significance levels. 

One feature of the situation was that the proportion of the public who were 
prepared to travel by bus was fairly insensitive over the short term to the details 
of the services provided. Moreover, an analysis of costs established that the 
only major variable was the number of buses operated. Consequently, the 
financial outcome of any operating policy could be almost entirely predicted 
from the size of fleet used. The tasks that had to be performed were: 





1. The choice of a set of routes. 

2. The allocation of bus frequencies to routes. 

3. The provision of detailed timetables. 

4. The scheduling of buses to work the timetables. 

5. The scheduling of crews to the buses. 

It can be seen that all these problems would interact and that a strictly 
global model should be used. In practice, we were not able to produce a com- 
putationally feasible global model and the problem had to be uncoupled and 
the various tasks performed separately: 

1. An heuristic algorithm was used to produce a “good” network of routes. 

2. Frequencies were allocated to routes so as to minimize the total travel 
time for each demand matrix for each of a range of fleet sizes. 

In this way a range of solutions illustrating the consequences, in terms of 
routes operated and frequencies provided, of pursuing any of the alternative 
financial policies was produced. The transport committee then selected that 
policy which best fulfilled their sense of public duty and the OR team proceeded 
to the concluding phase of the job: 

3. Drawing up timetables. 

4. Scheduling buses. 

5. Scheduling crews. 

Conventional methods were used for (3) and (5), but the problem of sched- 
uling buses proved an interesting application of the linear programming 
assignment model. 
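The assignment formulation for bus scheduling can be illustrated on a toy instance. The trip times and turnaround allowance below are hypothetical, and brute-force enumeration stands in for an LP solver (adequate only at this size); the idea is to link each trip's end to at most one later trip's start, minimizing idle time, so that unlinked trips each require a fresh bus.

```python
from itertools import permutations

# Each timetabled trip has a (start, end) time in minutes. A bus finishing
# trip i can take over trip j if j starts after i ends plus a turnaround.
# Pairing trip-ends with trip-starts is an assignment problem.
trips = [(0, 40), (10, 55), (45, 90), (60, 100), (95, 140)]  # hypothetical
TURNAROUND = 5
BIG = 10_000  # cost of starting a fresh bus instead of linking two trips

n = len(trips)
cost = [[BIG] * n for _ in range(n)]
for i, (_, end_i) in enumerate(trips):
    for j, (start_j, _) in enumerate(trips):
        if start_j >= end_i + TURNAROUND:
            cost[i][j] = start_j - end_i  # idle time if one bus runs i then j

# Minimize total cost over one-to-one assignments of trip-ends to trip-starts.
best_cols = min(permutations(range(n)),
                key=lambda p: sum(cost[i][p[i]] for i in range(n)))
links = [(i, j) for i, j in enumerate(best_cols) if cost[i][j] < BIG]
buses_needed = n - len(links)  # every missing link means one more bus
```

Here the links chain the five trips into two bus workings, so two buses suffice.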


3. AN HEURISTIC ALGORITHM TO CHOOSE ROUTES 

The first difficulty in trying to design a network of routes is the lack of any 
clear criterion regarding what is a good network of routes and what is not. 
What we have said in earlier sections indicates that there cannot be a real 
criterion in the absence of knowledge of the frequencies to be run on the routes. 
Without a real design criterion, what must be done to design a network is to 
think of properties one would expect to be present in a “good” network of 
routes and then to produce an algorithm that will ensure that a network de- 
signed by the algorithm will have all the properties. Some properties one 
would expect from a good network are the following: 

1. Most journeys for which there is an appreciable demand will be possible 
without changing. 

2. Routes will be reasonably direct and not meander excessively. 

3. Routes will criss-cross often to facilitate changing. 

4. There will not be too many routes. 

The algorithm consists of the following steps: 

1. Producing an initial skeleton route of four nodes (a skeleton route con- 
sists of an ordered sequence of nodes that will appear in the final route but 
usually with other nodes inserted between each of them. The first and last 
nodes in the skeleton will be the termini of the final route). 

2. Inserting nodes one by one into the skeleton until a complete route is 
obtained. 

3. Eliminating from the demand matrix those demands satisfied by the 
route. 

4. If the demand matrix still contains significant demands, return to (1) 
to produce another route. 
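The four steps above can be sketched in code. This is a toy reconstruction, not the authors' program: the skeleton is chosen purely by flow, insertions maximize the flow increment, and the weighting terms and near-miss demand reduction described later are omitted.

```python
from itertools import permutations

def route_flow(route, demand):
    """Passengers carried: total demand between pairs of nodes on the route."""
    on = set(route)
    return sum(p for (a, b), p in demand.items() if a in on and b in on)

def build_routes(demand, nodes, min_demand=1, max_routes=10):
    """Sketch of the four steps: pick the best 4-node skeleton by flow (1),
    greedily insert the node giving the biggest flow increment (2), drop the
    demands the finished route satisfies (3), repeat while demand remains (4)."""
    demand = dict(demand)  # (origin, destination) -> passengers
    routes = []
    while sum(demand.values()) >= min_demand and len(routes) < max_routes:
        skeleton = max(permutations(nodes, 4),
                       key=lambda r: route_flow(r, demand))
        route = list(skeleton)
        while True:
            best = (0, None, None)  # (gain, node, gap)
            for n in nodes:
                if n in route:
                    continue
                for gap in range(1, len(route)):  # termini stay fixed
                    trial = route[:gap] + [n] + route[gap:]
                    gain = route_flow(trial, demand) - route_flow(route, demand)
                    if gain > best[0]:
                        best = (gain, n, gap)
            if best[1] is None:
                break  # route complete
            route.insert(best[2], best[1])
        for pair in list(demand):  # eliminate demands the route satisfies
            a, b = pair
            if a in route and b in route:
                del demand[pair]
        routes.append(route)
    return routes
```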


In order to perform these steps, an objective function is needed to judge 
skeleton routes, so that the best four-node skeletons can be chosen, and a criterion 
to judge which node is the best to insert into a given gap in a skeleton. In fact 
the first implies the second, since the best node to insert in a given gap in a 
skeleton will be the node that, when included, causes the biggest increment in 
the objective function. 

The objective function used had three components: 

1. The total passenger miles for the route already guaranteed by the 
skeleton. (This component is called the flow of the skeleton route and ensures 
that the route structure will have property 1.) 

2. The second component is included with the intention of ensuring pro- 
perty 2. It takes the form of subtracting the product of a weighting factor, 
which depends on the number of nodes in the skeleton, and the total length of 
the skeleton route. 

3. The third term in the objective function is intended to encourage 
property 3. It takes the form of adding the product of a weighting factor and 
the sum over all nodes in the route of the number of previously chosen routes 
passing through the node. 


When considering the nodes that should be included in a gap in a skeleton 
route, it would be expensive to have to evaluate increments to corrected flow 
for all nodes. Because of this, a test is included to eliminate many nodes im- 
mediately. Suppose we are to insert a node k between nodes r and s. Node 
k is eliminated immediately unless the distance from r to s via k is less than 1.5 
times the distance from r to s. 

A second, more complicated, test is designed to prevent the route from 
crossing itself. In effect, it eliminates node k if there is a node already in the 
route on the direct path from node r to node k or on the direct path from node k 
to node s. 
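The two elimination tests might look like the following sketch; the distance matrix and the `on_path` predicate are hypothetical stand-ins for the authors' road-network data, and the 1.5 detour factor is the one quoted in the text.

```python
def passes_detour_test(dist, r, s, k, factor=1.5):
    """First test: node k is only considered for insertion between r and s
    if the detour r -> k -> s is under `factor` times the direct distance."""
    return dist[r][k] + dist[k][s] < factor * dist[r][s]

def passes_crossing_test(route, on_path, r, k):
    """Second test (simplified sketch): reject k if some node already in the
    route lies on the direct path from r to k, which would make the route
    cross itself. `on_path(a, b)` is a hypothetical predicate returning the
    intermediate nodes on the direct path between a and b."""
    return not any(n in on_path(r, k) for n in route)
```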

When a complete route is chosen, all demands for journeys directly served 
by the route are eliminated from the demand matrix, and demands for journeys 
for which the route passes close to the nodes in question are reduced by a 
factor depending on how close to the nodes the route passes. 

The algorithm produced routes that appeared sensible to management and, 
when frequencies were allocated to them, gave better values for total travel 
time than the best that could be achieved with the current routes. 





4. THE COMPUTATION OF TOTAL TRAVEL TIME 
FOR A FIXED SYSTEM 

We denote the total travel time for all passengers during some period of the day 
by Z. Minus Z is a measure of the utility of the bus service. It is clear that Z 
must have the following characteristics: 

1. Z must decrease as the frequency of a route increases. 

2. If there is no route through a particular node, but there is a route 
through a node that is physically near, the effect on Z should be less severe than 
if the nearest node served were some distance away. 

3. If there is a demand to travel a fairly short distance on an infrequent 
route, much of the demand may be lost because it will often be quicker to walk 
than to wait for a bus. Introducing an infrequent service on a route with mostly 
short-distance travelers should not decrease Z by much. 

4. Other things being equal, a system of routes that crosses often and so 
provides the most possibilities to passengers to change efficiently should have a 
lower Z than one that does not. 

5. The contribution to Z for journeys from A to B should take account 
of all routes from A to B that could reasonably be used. 

The description of how Z was calculated and how this ensures that Z has 
all the necessary characteristics is given below. 

Z was an estimate of the total travel time spent by all travelers, including 
in travel time a waiting time of one half the interarrival time for any route 
used. This ensured that Z had characteristic 1. If there were no route through a 
node, it was assumed that the passengers walked at 3 mph to the node that 
would minimize their total journey time, including walking, waiting, and riding. 
This total time was the time used in providing a contribution to Z, and this 
provided characteristic 2. 

It was assumed that demands for any particular journey were produced at 
random and that if, when a demand occurred, it was quicker for the person to 
walk to his destination at 3 mph than to wait for a bus, he did, in fact, walk. 
The time for the journey used in the contribution to Z was the expected time 
given in equation (1). 

t(f) = ∫₀^(T−τ) (τ + t) φ(t) dt + T ∫_(T−τ)^f φ(t) dt,    (1) 

where τ = bus journey time, 

T = walking time, 

φ(t) dt = probability of a bus arriving in (t, t + dt), 

f = frequency (interarrival time between successive buses). 
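For the special case of a uniform interarrival density φ(t) = 1/f on (0, f), the expected journey time can be evaluated in closed form. A sketch under that assumption (the paper's φ(t) is general): the passenger boards the bus when the remaining wait plus ride beats the walk, and walks otherwise.

```python
def expected_time(tau, T, f):
    """Expected journey time with bus journey time tau, walking time T and
    interarrival time f, assuming waits are uniform on (0, f) and a passenger
    walks whenever the walk beats the remaining wait plus ride. A sketch for
    the uniform special case only."""
    if T <= tau:
        return T                      # walking always at least as fast
    cutoff = min(T - tau, f)          # wait beyond this and walking wins
    # Bus taken when the wait t < cutoff: integral of (tau + t)/f over (0, cutoff).
    bus_part = (tau * cutoff + cutoff**2 / 2) / f
    walk_part = T * (1 - cutoff / f)  # otherwise the passenger walks
    return bus_part + walk_part
```

When walking is never worthwhile (T very large) this reduces to the familiar τ + f/2, the ride plus half the headway.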

The procedure to calculate Z consisted of setting up a matrix of times for 
journeys from A to B. First, walking times were put in the cells, but when 
direct bus routes were available times for waiting and riding or times for walking 
or waiting and riding were entered. Then the matrix was “improved”: this 
consisted of systematically looking at each (i, j) element of the matrix and 
seeing whether there was a k for which the journey times i→k, k→j totaled 
less than the journey i→j. When this was so, the journey time was replaced by 
the total. The procedure used was that suggested by Murchland [1] and produces 
the matrix of minimum times for all journeys in an N × N matrix in only 
N(N − 1)(N − 2) such operations. This procedure for calculating Z implied 
characteristic 4. 
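The improvement step can be sketched as a triple loop over intermediate nodes; with k outermost a single pass suffices (the Floyd-Warshall ordering), and skipping the trivial cases gives exactly N(N-1)(N-2) comparisons, matching the count quoted in the text. This is a sketch, not necessarily Murchland's exact procedure.

```python
def improve(times):
    """Replace t[i][j] by t[i][k] + t[k][j] whenever some intermediate node k
    gives a quicker journey, yielding minimum times for every pair. Skipping
    i == k, j == k and j == i leaves N(N-1)(N-2) comparisons in total."""
    n = len(times)
    t = [row[:] for row in times]  # leave the input matrix untouched
    for k in range(n):
        for i in range(n):
            if i == k:
                continue
            for j in range(n):
                if j == k or j == i:
                    continue
                if t[i][k] + t[k][j] < t[i][j]:
                    t[i][j] = t[i][k] + t[k][j]
    return t
```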

The problem of calculating contributions to Z for journeys from A to B 
when more than one route runs from A to B was rather more complicated. It was 
assumed that the different routes were scheduled independently and that 
passengers arrive at random. It would have been nice to assume that passengers 
then chose the route that would get them to their destination soonest, but un- 
fortunately this situation was difficult to analyze. What was done was to assume 
that passengers ignore obviously bad routes (e.g., one for which the journey time 
is greater than the journey time plus one half the interarrival time for the best 
route) and choose the first bus to arrive from one of the other routes. For the 
simplest case, in which there are two routes to be considered with interarrival 
times t₁ and t₂ (t₁ < t₂) and walking is out of the question, the average waiting 
time used in Z is given by 

t₁/2 − t₁²/(6t₂).    (2) 

If τ₁ and τ₂ are the bus journey times for the two routes, the average journey 
time is given by 

t₁/2 − t₁²/(6t₂) + τ₁(1 − t₁/2t₂) + τ₂(t₁/2t₂).    (3) 
More general cases were handled in the procedure used. 
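Under the stated assumptions the wait is the minimum of two independent waits, each uniform on its route's interarrival time, and its expectation works out to t₁/2 − t₁²/(6t₂). A quick Monte Carlo check of that closed form (which is our own derivation from the stated assumptions, not a formula quoted from the paper):

```python
import random

def average_wait(t1, t2, trials=200_000, seed=1):
    """Monte Carlo estimate of the mean wait for the first of two buses whose
    schedules are independent, so each wait is uniform on its own headway."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.uniform(0, t1), rng.uniform(0, t2))
    return total / trials

t1, t2 = 10, 15
closed_form = t1 / 2 - t1**2 / (6 * t2)  # expectation of the minimum wait
```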


5. MINIMIZING TOTAL TRAVEL TIME 


By taking the routes as fixed, Z could be regarded as a function of the frequencies 
allocated to the routes. If N denotes the number of routes and f_i the frequency† 
(interarrival time) on route i in minutes, Z could be written as Z(f_1, f_2, ..., f_N) 
or Z(f), where f denotes the vector (f_1, f_2, ..., f_N). For each route i a function 
e_i(f_i) could be calculated which was the forecast number of buses needed on 
route i to maintain a frequency f_i; e_i was not necessarily an integer, for it would often 
be possible to share buses between routes. In practice we found that if R_i is 
the round-trip time for route i (including statutory layover time), e_i could be 
satisfactorily forecast by 


e_i(f_i) = min( ⌈R_i/f_i⌉, 0.3 + R_i/f_i ),    (4) 


† We apologize for using the word “frequency,” which usually means a rate, for 
its inverse, an interarrival time, but this is the way bus operators talk and we have got 
into the habit. 





where ⌈R_i/f_i⌉ denotes the smallest integer greater than or equal to R_i/f_i. The 
financial policy of the organization dictated the total fleet size, which is denoted 
by F, so the problem of choosing frequencies could be written as 

min Z(f_1, f_2, ..., f_N)    (5) 

subject to 

Σ_{i=1}^{N} e_i(f_i) ≤ F. 

In fact, since it is obvious that all of the buses must be used to obtain minimum 
Z, a further restriction on the space to be searched was included: 

Σ_{i=1}^{N} e_i(f_i) > F − 1.    (6) 

The possible values that f_i can take are limited by the practical consideration 
that passengers like timetables which are easily memorized. For this reason 
frequencies need to be simple factors of an hour. The frequencies actually 
considered were 5 min., 10 min., 12 min., 15 min., 20 min., 30 min., 40 min., and 
60 min. 

Some of the routes connected with other transport services not in the control 
of the Undertaking. In these cases the frequencies of the route in question were 
constrained to be simple factors or multiples of the connecting service so that 
when timetables were drawn up subsequently it was possible to organize the 
connections conveniently. Thus for a route that connected with a 15-min. train 
service the only frequencies considered were 5, 15, 30, and 60 min. 

In a few cases routes run by other undertakings ran partly through the area 
under consideration. These routes were considered in calculating Z, but the 
frequencies on them were not varied to optimize the situation, since they were 
not under the control of the Undertaking. 

The model outlined so far is inadequate because it does not take into account 
the limited capacity of buses. Unless steps were taken to prevent it, the model 
could have suggested a system that could not deal with some of the demands. 
To prevent this, maximum utilization factors were produced for each of the 
routes in parallel with the calculation of Z. These factors were calculated by 
locating, for each route, the arc with the maximum number of passengers 
traveling along it and dividing the passengers per hour for this route by its 
hourly capacity. A further constraint on the minimization was that none of 
these maximum utilization factors could exceed unity. 

The method of minimum seeking used was a modified random search pro-
cedure. The idea was that an initial guessed solution started the procedure and
that thereafter new values of f were produced by random perturbations from the
best f found to date. Two novel features were that the size of the perturbations
was made to decrease as the procedure continued, as more confidence could be
placed in the current best f, and that when only small perturbations were being
used, and a new current best was located, those elements of f that had been




A. H. LINES, W. LAMPKIN, P. D. SAALMANS 


increased to produce the new optimum were given a tendency to increase 
further and those that had been decreased were given a tendency to decrease 
further. 

This method of minimum seeking has been reasonably successful, but we 
are not confident that it is the best possible and would be grateful to receive 
suggestions of better methods that would work when minimizing over such a 
peculiar, ill-defined discrete space. 
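The two novel features of the search, shrinking perturbations and a directional bias after each improvement, can be sketched as follows. The objective Z here is a stand-in (demand-weighted half-headway waiting time), since the paper's travel-time model is not reproduced, and the route times, demands, and fleet size in the example are invented for illustration.

```python
import math, random

HEADWAYS = [5, 10, 12, 15, 20, 30, 40, 60]  # minutes, as in the paper

def buses(round_trip, h):
    # buses needed to run a route of round-trip time `round_trip` (min) at headway h
    return math.ceil(round_trip / h)

def Z(headways, demand):
    # stand-in objective: demand-weighted expected wait of half a headway
    return sum(d * h / 2 for d, h in zip(demand, headways))

def random_search(routes, demand, fleet, steps=2000, seed=1):
    rng = random.Random(seed)
    n = len(routes)
    best = [len(HEADWAYS) - 1] * n            # start from the sparsest (feasible) service
    best_z = Z([HEADWAYS[i] for i in best], demand)
    bias = [0] * n                            # drift remembered from the last improvement
    for step in range(steps):
        width = max(1, round(3 * (1 - step / steps)))   # perturbations shrink over time
        cand = best[:]
        for r in rng.sample(range(n), k=min(width, n)):
            move = bias[r] if bias[r] and rng.random() < 0.5 else rng.choice([-1, 1])
            cand[r] = min(len(HEADWAYS) - 1, max(0, best[r] + move))
        hs = [HEADWAYS[i] for i in cand]
        if sum(buses(rt, h) for rt, h in zip(routes, hs)) > fleet:
            continue                          # infeasible: exceeds fleet size F
        z = Z(hs, demand)
        if z < best_z:
            bias = [c - b for c, b in zip(cand, best)]  # bias future moves this way
            best, best_z = cand, z
    return [HEADWAYS[i] for i in best], best_z
```

Running `random_search(routes=[60, 90, 45], demand=[8, 3, 5], fleet=12)` returns a feasible headway vector whose Z is no worse than the starting guess.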


6. SCHEDULING BUSES 

The results obtained with the bus allocation program consisted of a list of
routes and frequencies to be operated at various times of the day for each day
of the week. The timetables compiled from these results took into account the
rail and ferryboat services previously referred to.

In order to assist with the implementation of these new timetables, bus 
schedules (lists of journeys for individual buses to operate during the course of 
each working day) were designed. Generally, the objective of the design of a 
set of bus schedules is to draw up details of the day’s work for each bus, more 
or less independently of crew schedules, so that the total number of buses needed 
for operating the timetables is a minimum. Conventional methods consist of
tactical, commonsense techniques that do not necessarily result in the minimum
number of active buses being used. The method used by the OR team was to
minimize the total layover time at termini, subject to certain minima specified 
in trade union agreements. Layover time is the difference between the time a 
bus arrives at the terminus and the time it departs. For termini common to 
two or more routes this minimization procedure was achieved by interchanging 
vehicles between routes. 

A linear programming algorithm was used to assign available buses to depar- 
tures at each terminus. The list of available buses consisted of the following: 

1. Vehicles that had finished a journey at the terminus and could be assigned
at a cost equal to the layover time implied by the particular assignment.

2. Vehicles standing in the garage that could be brought out and assigned
to a departure at a high cost.

The results obtained by this application of linear programming to scheduling
gave a slightly lower number of active buses than that implied by (4) in
Section 5.
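A minimal sketch of the bus-linking step, assuming a single terminus with known arrival and departure times. It uses a greedy first-in-first-out rule in place of the paper's linear program: each departure takes the earliest bus that has satisfied the minimum layover, and a garage bus is charged only when no arrived bus qualifies. The function name and the example times are invented.

```python
def link_departures(arrivals, departures, min_layover):
    """Assign each departure to an available arrived bus, else a garage bus.

    A greedy stand-in for the LP assignment: at one terminus a departure is
    covered by the earliest bus that has already arrived and satisfied the
    minimum (trade-union) layover; otherwise a bus is brought out of the
    garage at high cost.  Returns the plan and the garage-bus count.
    """
    pool = sorted(arrivals)          # buses waiting at the terminus
    garage_buses = 0
    plan = []
    for dep in sorted(departures):
        ready = [t for t in pool if t + min_layover <= dep]
        if ready:
            t = ready[0]             # earliest arrival: smallest added layover
            pool.remove(t)
            plan.append((t, dep))    # layover cost of this pairing is dep - t
        else:
            garage_buses += 1
            plan.append((None, dep))
    return plan, garage_buses
```

For example, with arrivals at 10, 20, 30, departures at 12, 25, 50, and a 5-minute minimum layover, the 12:00 departure needs a garage bus and the other two are covered by interchanged vehicles.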


7. RESULTS OBTAINED 

A system of bus routes and frequencies for each of three financial policies was 
designed. 

1. An annual profit of £35,000.

2. An annual loss of £35,000. 

3. A breakeven point at which expenditure is equal to revenue. 





The calculations used were based on estimated costs for 1965-1966 and the
current fares. The current financial state of the Undertaking was such that the
estimated loss for 1965-1966 was £35,000.

The results obtained for the profit policy (1) indicated that it was not pos-
sible to achieve an annual profit of £35,000 with the current fare structure
without lowering the level of service given to passengers to the point at which
they were being left behind at bus stops.

The results obtained for the loss policy (2) showed that it was possible to 
reorganize the current bus system to give better service to passengers if the 
Undertaking were to be subsidized to the extent of £35,000 per annum. 

The work carried out in connection with the breakeven policy (3) indicated
that it was possible to reduce the present scale of operation during the peak
periods to attain a breakeven of revenue and expenditure. This breakeven was
to be achieved by operating the peak periods with 56 buses (instead of 70, which
was the current practice) and reorganizing the route structure and bus fre-
quencies. The resulting level of service given to passengers was estimated to
be at most 3 per cent worse than at present, and the resulting revenue was
estimated to be approximately the same.

The level of activity in off-peak periods was unaltered, and reorganization 
of services made it possible to give a considerably better service in these periods. 
One of the reasons for altering the ratio between peak and off-peak levels of 
activity was to enable the operating crews to be usefully employed throughout 
their working shift. This policy involved discontinuing four existing routes and 
introducing three new routes during the peak periods. 

The Transport Committee and subsequently the Town Council decided to 
adopt the breakeven policy. 


8. FURTHER DEVELOPMENT 

The approach described in this article excited considerable attention from the 
bus operators of Britain, most of whom were facing the problem of how to 
adapt their services to rapidly changing traveling requirements. At the time of
writing, similar studies were in hand for two operators and discussions were
proceeding with others.

We are convinced that our approach to bus transport studies is capable of 
rather more general application. Railways, airlines, and shipping services 
would all appear to be amenable to such treatment with the important simplifi-
cation that in most the choice of nodal points, which presents certain difficulties
for bus services, is largely a fait accompli.

Particularly in the case of railways, however, there is an obvious inflexibility
in the extent to which any solution could be implemented; benefits would have
to be considerable to justify constructing a new line; but, when there are grounds
for believing that a railway line should be closed, a proper quantification of the
effect on total travel time would provide rather more convincing evidence than
did Dr. Beeching's survey of British Railways. This was based on a one-day





sample of traffic at each railway station, and the recommendations were made
on a line-by-line basis without any proper attempt at a comprehensive analysis.
Where this approach, in our opinion, has most to offer is in planning the
routes of new subways, underground railways, and monorails and in selecting
station sites. Most such construction (e.g., the London Victoria line) is currently
planned on the basis of traffic surveys and "commonsense," or occasionally by the
use of cost-benefit analysis. The ideas underlying this case study would seem
to provide a better measurement of "benefit" than that usually obtained.


9. REFERENCE 

[1] J. D. Murchland, "A New Method for Finding All Elementary Paths in a Complete
Directed Graph," L.S.E.-T.N.T. 22, London School of Economics report.

LA DETERMINATION DES ITINERAIRES ET DE
LA FREQUENCE DES SERVICES D'UNE COMPAGNIE
MUNICIPALE D'AUTOBUS


RESUME 

A study of a bus undertaking in an English town is described. The undertaking
was losing 5% of its passengers each year and was operating at a loss. Its
network of routes and its service frequencies had remained unchanged for
30 years and were badly in need of a thorough revision.

The measure of utility, or "public service," of this undertaking was defined
and named "total travel time." It was shown that the greater the public service
rendered, the greater the undertaking's deficit.

A procedure for determining a specific structure of demand was developed
in order to establish a relation between total travel time and improved financial
performance.

Analytical work was undertaken on three subjects of major importance: the
development of a heuristic algorithm for choosing routes; the writing of a
computer program to determine the frequency for each route so as to minimize
travel time; and the design of a bus allocation system from which to define a
timetable.

The article concludes with some suggestions for wider application of this
approach to the study of air, sea, and rail transport networks and its possible
contributions to the choice of routes for subways and monorails.





SHORTEST DISTANCES THROUGH PARTITIONED GRAPHS 
WITH AN APPLICATION TO URBAN TRANSPORTATION 

PLANNING 

Plus Courtes Distances par la Méthode des Graphes
Cloisonnés avec une Application à la Préparation d'un
Système de Transports Urbains

Charles W. Blumentritt

Texas Transportation Institute
United States of America


1. INTRODUCTION 

As defined by Berge [1], a graph is recognized by the existence of (a) a set X
and (b) a function Γ mapping X into X.

In this article x ∈ X is considered to be a point in a plane and is termed a
vertex of the graph. The relation x' ∈ Γx designates a connective property
between the vertices x and x' and is termed an arc of the graph. An arc implies
orientation, and the arc u is symbolized by u = (x, x'). Let G = (X, Γ) = (X, U)
denote the total graph, where U is the set of arcs for the graph G. An edge is
characterized by x_i and x_j ∈ X such that (x_i, x_j) ∈ U or (x_j, x_i) ∈ U.

The name path is given to a sequence μ of arcs of a graph G = (X, U) such
that the terminal vertex of any arc except the final one coincides with the
initial vertex of the succeeding arc. If a path μ meets, in order, the set of
vertices {x_1, x_2, ..., x_r, x_{r+1}}, the notation μ = (x_1, x_2, ..., x_r, x_{r+1}) is used.
The corresponding concept for sequences of edges, rather than arcs, is a chain.
A graph is connected if for every pair of distinct vertices there is a chain going
from one to the other. A strongly connected graph has a path joining every pair
of distinct vertices.

A subgraph of the graph G = (X, Γ) is of the form G_{S_x} = (S_x, Γ_{S_x}), where
S_x ⊂ X and x' ∈ Γ_{S_x}(x) ⟺ x' ∈ (Γx ∩ S_x). Given a vertex x, we denote by
S_x ⊂ X the set of vertices, including x, which can be connected by a chain to
x. A component of G is the subgraph determined by a set of the form S_x.

Theorem 1 [1]. The different components of the graph G = (X, Γ) constitute
a partition of X, that is,

1. S_x ≠ ∅, where ∅ is the null set.

2. S_x ∩ S_y = ∅ for distinct components S_x and S_y.

3. ∪_x S_x = X.

Theorem 2 [1]. A graph is connected if and only if it possesses only one 
component. 





We define an arc u = (a, b), b ≠ a, as incident out from a. Similarly,
u = (b, a), b ≠ a, is an arc incident into a. If G = (X, U) is strongly connected
and X is partitioned into nonempty sets S and S' = {X − S}, the arc u =
(s, s'), s ∈ S, s' ∈ S', is incident into S' and incident out from S. We denote
by U_(S',S) the set of arcs incident from S' into S. When no ambiguity is present
we may write U_(S,S') as U_S⁺, etc. We also note that

U_S⁺ = U_{S'}⁻. The cut-set U_S of G', G' = (S ∪ S', U), is

U_S = U_S⁺ ∪ U_S⁻ = U_(S,S') ∪ U_(S',S) = U_{S'}.

Note that U_S = U_{S'} ≠ ∅, since G is connected.

Theorem 3 [3]. A graph G = (X, U) is strongly connected if and only if,
for every partitioning {S, S'}, where S' = {X − S},

U_S⁺ ≠ ∅ and U_S⁻ ≠ ∅.

Proof. If for some partitioning U_S⁺ = ∅, there can be no path from any vertex in
G_S to any in G_{S'}, so G is not strongly connected.

If G is not strongly connected, choose s ∈ S and s' ∈ S' such that no path
exists from s to s'. Choose as S the set of vertices consisting of s and all vertices
connected from s by paths. Then U_S⁺ = ∅. This completes the proof.
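Theorem 3 can be checked directly, though only for tiny graphs, by enumerating every partitioning {S, S'} and testing both cut conditions. A sketch (the vertex labels and example cycle are invented for illustration):

```python
from itertools import combinations

def strongly_connected(vertices, arcs):
    """Check Theorem 3 literally: G is strongly connected iff every
    partitioning {S, S'} has an arc out of S (U_S+ nonempty) and an arc
    into S (U_S- nonempty).  Exponential in |X|, so purely illustrative.
    """
    vs = list(vertices)
    for k in range(1, len(vs)):
        for subset in combinations(vs, k):
            S = set(subset)
            has_out = any(a in S and b not in S for a, b in arcs)  # U_S+ nonempty
            has_in = any(a not in S and b in S for a, b in arcs)   # U_S- nonempty
            if not (has_out and has_in):
                return False
    return True

cycle = {(0, 1), (1, 2), (2, 0)}   # a directed 3-cycle: strongly connected
```

Dropping the arc (2, 0) leaves the cut S = {2} with no arc incident out, so the check correctly fails.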

Given G_{S_x} = (S_x, Γ_{S_x}) and G_{S_y} = (S_y, Γ_{S_y}), if for some x ∈ S_x there exists
y ∈ Γx such that y ∈ S_y, then G_{S_y} may be termed adjacent to G_{S_x}. Since the
subgraphs G_S of the strongly connected graph G constitute a partition and
S ≠ ∅, each subgraph G_{S_k} is adjacent to at least one subgraph G_{S_j} (j ≠ k).

Theorem 4. Given the graph G = (X, U), where {S_x, S_y} = X, x ∈ S_x,
y ∈ S_y, then y ∈ Γx if and only if (x, y) ∈ U_(S_x,S_y).

Proof. If y ∈ Γx, such that y ∈ S_y, the arc connecting x to y also connects
subgraphs G_{S_x} and G_{S_y}, and by definition of a cut-set (x, y) is a member of
U_(S_x,S_y).

Conversely, if (x, y) ∉ U_(S_x,S_y), then either y ∉ Γx, or y ∈ S_x; but if y ∈ S_x, the arc
joining x and y is within G_{S_x} and does not join G_{S_x} to G_{S_y}.


2. MINIMUM PATH OR SHORTEST DISTANCE PROBLEM 

Consider the problem in which, given a graph G = (X, U), a number l(u) ≥ 0,
called the "length" of u, is assigned to each arc u; find a path μ going from a
vertex a ∈ X to a vertex b ∈ X such that its total length

l(μ) = Σ_{u ∈ μ} l(u)

is a minimum. We assume the connective properties of G, hence the existence
of μ. Ford [4] and Moore [5] proposed computationally suitable algorithms
for this problem. The basic algorithm of index reduction outlined by Ford is
repeated:




Algorithm 1

1. Associate with each vertex x_i an index λ_i, where λ_i represents the distance
of reaching vertex x_i from a particular vertex x_0.

2. Initialize λ_0 = 0, λ_j = ∞, j ≠ 0; let x_i = x_0.

3. Scan the network for an arc (x_i, x_j) such that λ_j − λ_i > l(x_i, x_j). For
each arc that satisfies this inequality replace λ_j by λ_i + l(x_i, x_j).

4. Continue this process of Step 3 for each x_i from which a new λ_j has
been determined until no λ_j can be diminished further.

There exists a vertex x_{k_1} adjacent to vertex x_i such that λ_i − λ_{k_1} = l(x_{k_1}, x_i).
This is so because the last monotonic reduction of λ_i was made from vertex
x_{k_1}. Similarly, let x_{k_2} be the adjacent vertex to x_{k_1} from which a monotonic
reduction of λ_{k_1} was made. Since the index reduction procedure has caused the
sequence λ_i, λ_{k_1}, λ_{k_2}, ... to be monotonic nonincreasing as it approaches λ_0,
there exists a vertex x_{k_{s+1}} corresponding to the index λ_{k_{s+1}} such that x_{k_{s+1}} = x_0
and λ_{k_{s+1}} = 0.
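Algorithm 1 can be sketched in a few lines: the scan-until-stable loop below mirrors Steps 3 and 4. The small arc dictionary and the 0-based vertex labels are invented for illustration.

```python
INF = float("inf")

def index_reduction(n, arcs, source=0):
    """A sketch of the Ford/Moore index reduction (Algorithm 1).

    `arcs` maps (i, j) -> length l(i, j) >= 0.  Repeatedly scan for an arc
    with lambda_j - lambda_i > l(i, j) and reduce lambda_j, until no index
    can be diminished further; lambda_i is then the shortest distance.
    """
    lam = [INF] * n
    lam[source] = 0
    reduced = True
    while reduced:                            # Step 4: repeat until stable
        reduced = False
        for (i, j), length in arcs.items():   # Step 3: scan every arc
            if lam[i] + length < lam[j]:
                lam[j] = lam[i] + length      # reduce the index lambda_j
                reduced = True
    return lam

# an invented four-vertex test graph
arcs = {(0, 1): 3, (0, 3): 1, (1, 2): 1, (2, 0): 1, (2, 3): 1, (3, 1): 1}
```

Here `index_reduction(4, arcs)` yields the label vector [0, 2, 3, 1]: the cheap detour 0 → 3 → 1 undercuts the direct arc of length 3, exactly the kind of correction the repeated scanning performs.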


Theorem 5. For a strongly connected graph G = (X, U), where a length
parameter l(u) ≥ 0 is assigned to each arc u, the application of the index reduction
procedure yields the shortest distance λ_i from vertex x_0 ∈ X to all vertices x_i ∈ X
(i ≠ 0), and the path

μ_0 = (x_0, x_1, x_2, ..., x_i)

is the shortest path between x_0 and x_i.

Proof. Consider any path

μ = (x_0, x_{p_1}, x_{p_2}, ..., x_{p_s}, x_i)

between vertices x_0 and x_i, where μ_0 has been obtained by the index reduction
procedure. Since G is strongly connected, at least one path μ must exist.
Then

λ_{p_1} − λ_0 = λ_{p_1} − 0 ≤ l(x_0, x_{p_1}),
λ_{p_2} − λ_{p_1} ≤ l(x_{p_1}, x_{p_2}),
...
λ_i − λ_{p_s} ≤ l(x_{p_s}, x_i);

summing these terms,

λ_i − 0 ≤ l(μ).

Since λ_i = l(μ_0) for the path μ_0, μ_0 is a shortest path, and the proof is
completed.


Denote by G_{S_x} = (S_x, U_{S_x}) a subgraph of G. Let G_{S_y} be the
subgraph adjacent to G_{S_x}; then G_{S_y} = (S_y, U_{S_y}). Define the augmented
subgraph by G'_{S_x} = (S'_x, U'_x), where

S'_x = S_x ∪ {y : y ∈ (S_y ∩ Γx), x ∈ S_x},
U'_x = U_{S_x} ∪ U_(S_x,S_y).





Thus G'_{S_x} consists of the arcs connecting vertices of S_x in addition to those arcs
incident from S_x into S_y. We note that G_{S_y} may be partitioned such that

S_y = {S_1, ..., S_i, ...}, i ≠ x, S_i ≠ ∅.

Theorem 6. For the subgraph G_{S_x} = (S_x, U_{S_x}) of the strongly connected
graph G = (X, U), let G'_{S_x} = (S'_x, U_{S_x} ∪ U_{S_x}⁺) be the augmented subgraph of
G_{S_x}. If there is associated a length parameter l(u) ≥ 0 with each arc u = (x, x'),
x ∈ S_x, x' ∈ S'_x, then a path μ' can be determined which contains at least one arc
between x_0 ∈ S_x and a vertex x'_n ∈ {S'_x − S_x} such that the length of μ' is minimized.

Proof. We use Theorem 5 to show that a path of minimum length can be
determined within G'_{S_x} between x_0 and x_r, if x_r is connected within G'_{S_x} by a
path from x_0. It remains to be shown that for x_0 ∈ S_x there is at least one
vertex x'_n ∈ {S'_x − S_x} connected by a path from x_0.

Since G_{S_x} is a subgraph of the strongly connected graph G = (X, Γ),
S_x ≠ ∅ and must contain at least one vertex; but subgraph G_{S_x} is adjacent
to at least one other subgraph, and there exists at least one x'_i ∈ {S'_x − S_x}
because G is strongly connected and, by Theorem 3, U_{S_x}⁺ ≠ ∅; hence a path
exists from x_0 to x'_i.


3. MINIMUM PATHS THROUGH PARTITIONED GRAPHS 

The problem of finding a minimum path between a specified vertex of a strongly
connected graph G = (X, Γ) and all other vertices of G may become unwieldy
when the set X is large. An example is when the description of the graph
suitable for machine processing exceeds the handling capabilities of the cal-
culating device. In this case it may be desirable to work with components
of G as presented in the following problem: consider the task of finding the
minimum path from a vertex within a subgraph G_S of G = (X, U) to all other
vertices of G, given that only the vertices and arcs within a particular augmented
subgraph G'_S are available during any stage of the computation.

As mentioned by Blumentritt [2], the method of partitioned index reduction
is given as a feasible construction of the graph with minimum-length connecting
properties from a specified vertex x_0.

Algorithm 2

1. Partition the strongly connected graph G = (X, U) into subgraphs
G_{S_m} = (S_m, U_{S_m}). Recognize that each subgraph G_{S_m} has associated with it
an augmented subgraph G'_{S_m}.

2. Associate with each vertex x_i ∈ X an index λ_i, where λ_i represents the
distance of reaching vertex x_i from a particular vertex x_0 ∈ S_x.

3. Initialize λ_0 = 0, λ_j = ∞, j ≠ 0; let x_i = x_0.

4. Scan the augmented subgraph G'_{S_x} for an arc (x_i, x_j), x_i, x_j ∈ S'_x, such
that λ_j − λ_i > l(x_i, x_j). For each arc that satisfies this inequality, replace λ_j
with λ_i + l(x_i, x_j).

5. Continue this process of Step 4 for each x_i from which a new λ_j has
been determined until no λ_j can be diminished further.

6. For each S_m ∩ {S'_x − S_x} ≠ ∅, let S_m = S_x and continue the process
of Step 5 until no further λ reduction is possible in G.

In a manner similar to that shown for Algorithm 1, we note that there
exists a vertex x_{n_1} adjacent to vertex x_i such that λ_i − λ_{n_1} = l(x_{n_1}, x_i), because
the last monotonic reduction of λ_i was made from vertex x_{n_1}. We let x_{n_2} be
the adjacent vertex to x_{n_1} from which a monotonic reduction of λ_{n_1} was made.
The partitioned index reduction procedure has caused the sequence λ_i, λ_{n_1},
λ_{n_2}, ..., λ_0 to be monotonic nonincreasing as it approaches λ_0, and we state
and prove the related theorem.

Theorem 7. For a graph G = (X, U), where a length parameter l(u) ≥ 0
is assigned to each arc u and a partitioning of G into n subgraphs is imposed,
the application of the partitioned index reduction procedure to the augmented
subgraphs G'_{S_m} of G yields the length of the shortest path from vertex x_0 ∈ S_x
to vertex x_i ∈ X as λ_i. The path

μ = [(x_0, x_{q_1}, x_{q_2}, ...) ∈ S_x,
(x_{n_1}, x_{n_2}, x_{n_3}, ...) ∈ S_n,
(..., x_{r_1}, x_{r_2}, x_i) ∈ S_r]

is the shortest path between vertex x_0 and vertex x_i.

Proof. Consider the paths

μ_q = (x_0, x_{q_1}, x_{q_2}, ...) ∈ S_x,
μ_n = (x_{n_1}, x_{n_2}, x_{n_3}, ...) ∈ S_n,
μ_r = (..., x_{r_1}, x_{r_2}, x_i) ∈ S_r,

where μ = (μ_q, μ_n, ..., μ_r) is a path between vertices x_0 and x_i
obtained by the partitioned index reduction method. By Theorem 6, and since
G is strongly connected, at least one path μ must exist.

In general, consider the path sequence of vertices that falls within a par-
ticular G_{S_n} = (S_n, U_{S_n}):

μ_n = (x_{n_1}, x_{n_2}, ..., x_{n_s}) ∈ S_n.

Since each l(u) ≥ 0 and each λ_i ≥ 0, the partitioned index reduction pro-
cedure has caused the sequence

λ_{n_s}, λ_{n_{s−1}}, ..., λ_{n_1}

to be monotonic nonincreasing as it approaches λ_{n_1}, and

λ_{n_1} ≤ λ_{n_2} ≤ ... ≤ λ_{n_{s−1}} ≤ λ_{n_s}.

Then the sequence

λ_i ≥ λ_{r_2} ≥ λ_{r_1} ≥ ... ≥ λ_{n_s} ≥ ... ≥ λ_{q_1} ≥ λ_0 = 0

is also monotonic nonincreasing as it approaches λ_0.






Figure 1. 

If we now associate with each x_m ∈ μ_m the new index λ'_m, the application
of Theorem 5 shows that μ_m is the shortest path between x_{m_1} and x_{m_r}. Con-
tinuing, we see that λ_i = l(μ), hence μ is the shortest path and the proof is
completed.

An example of the method is given by graph G in Figure 1a, and the par-
titioning into subgraphs G_{S_1}, G_{S_2}, G_{S_3}, and G_{S_4} with augmented subgraphs
G'_{S_1}, G'_{S_2}, G'_{S_3}, and G'_{S_4} is shown in Figure 1b. Note that





l(1, 2) = 3,  l(1, 4) = 1;    S_1 = {1};    S'_1 = {1, 2, 4},
l(2, 3) = 1;                  S_2 = {2};    S'_2 = {2, 3},
l(3, 1) = 1,  l(3, 4) = 1;    S_3 = {3};    S'_3 = {1, 3, 4},
l(4, 2) = 1;                  S_4 = {4};    S'_4 = {2, 4}.


We seek the minimum distance from vertex 1 to all other vertices. Initialize
λ_1 = 0, λ_2 = ∞, λ_3 = ∞, λ_4 = ∞. Consider

G'_{S_1}:  λ_1 + l(1, 2) = 0 + 3 < ∞,  → λ_2 = 3;
          λ_1 + l(1, 4) = 0 + 1 < ∞,  → λ_4 = 1.

Since 2 ∈ S_2 and 4 ∈ S_4, we consider next either G'_{S_2} or G'_{S_4}. We arbitrarily
choose G'_{S_2}:

λ_2 + l(2, 3) = 3 + 1 < ∞,  → λ_3 = 4.

Thus G'_{S_3} is due consideration, but first G'_{S_4} may be studied:

λ_4 + l(4, 2) = 1 + 1 < 3,  → λ_2 = 2.

Hence G'_{S_2} must again be considered. Before this, however, G'_{S_3} was in sequence,
and

λ_3 + l(3, 4) = 4 + 1 > 1,  no reduction.

Considering G'_{S_2},

λ_2 + l(2, 3) = 2 + 1 < 4,  → λ_3 = 3.

This reveals that G'_{S_3} must also be considered for the second time:

λ_3 + l(3, 4) = 3 + 1 > 1,  no reduction.

No further λ reductions remain to be considered, either in or outside their
respective subgraphs. Thus λ_2 = 2, λ_3 = 3, λ_4 = 1 represent the minimum
distances from vertex 1 of G.
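The worked example above can be reproduced with a small sketch of Algorithm 2. The queue-of-subgraphs discipline below is one way to realize Step 6, re-examining a subgraph whenever one of its vertices is improved from outside; the subgraph names and the function signature are invented.

```python
from collections import deque

INF = float("inf")

def partitioned_index_reduction(arcs, subgraphs, source):
    """A sketch of Algorithm 2: relax arcs one augmented subgraph at a
    time, re-queueing a subgraph whenever one of its vertices improves.

    `subgraphs` maps a subgraph name to its vertex set; the augmented
    subgraph of S is taken to be the arcs leaving vertices of S.
    """
    lam = {v: INF for vs in subgraphs.values() for v in vs}
    lam[source] = 0
    home = {v: name for name, vs in subgraphs.items() for v in vs}
    queue = deque([home[source]])
    queued = {home[source]}
    while queue:
        name = queue.popleft()
        queued.discard(name)
        for (i, j), length in arcs.items():
            if i in subgraphs[name] and lam[i] + length < lam[j]:
                lam[j] = lam[i] + length      # index reduction within G'_S
                if home[j] not in queued:     # Step 6: that subgraph must be revisited
                    queue.append(home[j])
                    queued.add(home[j])
    return lam

# the four-vertex example of this section
arcs = {(1, 2): 3, (1, 4): 1, (2, 3): 1, (3, 1): 1, (3, 4): 1, (4, 2): 1}
parts = {"S1": {1}, "S2": {2}, "S3": {3}, "S4": {4}}
lam = partitioned_index_reduction(arcs, parts, source=1)
```

The run visits S_2 and S_3 twice, just as in the hand calculation, and ends with λ_2 = 2, λ_3 = 3, λ_4 = 1.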

4. CONSTRAINED MINIMUM PATHS 

Certain applications of Theorem 7 require that considerable attention be given
to the computational time required for determining a minimum path graph,
even though the process is easily mechanized. In this case it is necessary to
study means by which maximum use may be realized of each G'_S so that
the time expended in transferring the digital description of each G'_S ∈ G is not
large with respect to the time elapsed in calculating index reductions. When
the graph is representative of a physical system, for example a highway or
street network, it may be convenient to choose cut-sets related to link crossings
of physical barriers (rivers, railroads, etc.). Thus it is conceived that for a
vehicle traversing a least-time path from origin to destination multiple crossings
of these barriers may be minimized or constrained.




4.1. Definition of Constraints 

If the strongly connected graph G = (X, Γ) is partitioned into n subgraphs
G_{S_m} such that the shortest path connecting any two arbitrary distinct vertices
of G uses at most one element of the cut-subset C_S between any subgraphs
G_{S_x} and G_{S_y}, where

C_S = {(x, y) : x ∈ (S_x ∩ S'_y), y ∈ (S_y ∩ S'_x)},

we say that the graph is B-partitioned. Clearly not every strongly connected
graph may be B-partitioned except trivially.

On the other hand, if the strongly connected graph G = (X, Γ) is partitioned
into n subgraphs G_{S_m} and it is stipulated that at most one element of each
cut-subset C_S, as described above, may be used in the sequence of arcs defining
a path, we say that the path (if it exists) is B-constrained.

Further, consider a partitioning of the strongly connected graph G = (X, U)
arranged such that a subgraph G_{S_p} = (S_p, U_{S_p}) of G exists, S_p ⊂ X, where the
augmentation of G − G_{S_p} by the cut-set U_{S_p}⁺ ∪ U_{S_p}⁻ reduces the number of
components of G by exactly one. Note that a partitioning disconnects a graph
into various components, since in a partitioning all cut-set arcs are eliminated.
If a G_{S_p} exists for each successive augmentation of the components of G until
the original graph G is obtained, we say the graph G is B*-partitioned.

4.2. Constraint Problem 

If the strongly connected graph G = (X, U) is B*-partitioned into sub-
graphs G_{S_m} and a length parameter l(u) ≥ 0 is associated with each arc u, the
problem is to determine the B-constrained path of minimum length from
vertex x_0 ∈ S_x to all vertices of G reachable by B-constrained paths. In a
flow network, in which flow is expected to follow the path of least resistance,
the cut-sets of the B*-partitioned network correspond to a medium that has
the property of allowing only unidirectional flow.

The following method of modified partitioned index reduction is presented
as a feasible technique for determining a B-constrained path of minimum
length from vertex x_0 ∈ S_x to all other vertices of the strongly connected graph
G = (X, U), given that G is B*-partitioned:

Algorithm 3

1. Step 1 of Algorithm 2 (restricted to B*-partitioning).

2. Step 2 of Algorithm 2.

3. Step 3 of Algorithm 2.

4. Step 4 of Algorithm 2.

5. Step 5 of Algorithm 2.

6. For each S_m ∩ {S'_x − S_x} ≠ ∅ not already considered, let S_m = S_x
and continue the process of Step 5.
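Algorithm 3's modification can be sketched directly: the subgraphs are scanned once each, in a fixed order, so no completed subgraph is re-entered and the resulting labels correspond to B-constrained paths. The chain example, the function name, and the explicit `order` argument are assumptions for illustration.

```python
INF = float("inf")

def b_constrained_reduction(arcs, subgraphs, order, source):
    """A sketch of the modified partitioned index reduction (Algorithm 3).

    Unlike Algorithm 2, each subgraph is considered only once, in the fixed
    `order` (which must list every subgraph name once, starting with the
    subgraph containing the source); paths therefore never re-enter a
    subgraph, which is the B-constraint.
    """
    lam = {v: INF for vs in subgraphs.values() for v in vs}
    lam[source] = 0
    for name in order:                        # each S_m considered only once
        changed = True
        while changed:                        # Steps 4-5 inside one subgraph
            changed = False
            for (i, j), length in arcs.items():
                if i in subgraphs[name] and lam[i] + length < lam[j]:
                    lam[j] = lam[i] + length
                    changed = True
    return lam

# a three-subnetwork chain with a return arc that the constraint never uses
arcs = {(0, 1): 1, (1, 2): 1, (2, 0): 1}
parts = {"A": {0}, "B": {1}, "C": {2}}
lam = b_constrained_reduction(arcs, parts, ["A", "B", "C"], source=0)
```

On this chain the one-pass scan already yields the true distances 0, 1, 2, since no shortest path needs to re-enter a subgraph.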

Theorem 8. For a graph G = (X, U), where a length parameter l(u) ≥ 0 is
assigned to each arc u and a B*-partitioning of G into n subgraphs G_{S_m} is imposed,
the application of the modified partitioned index reduction procedure yields the
length of the shortest B-constrained path (if it exists) from vertex x_0 ∈ S_x to vertex
x_i ∈ X as λ_i. The path

μ = [(x_0, x_{a_1}, ...) ∈ S_a,
(x_{b_1}, x_{b_2}, ...) ∈ S_b,
(..., x_{r_1}, x_i) ∈ S_r]

is the shortest B-constrained path between vertex x_0 and vertex x_i.

Proof. Consider any path sequence

μ = [(x_0, x_{a_1}, ...) ∈ S_a,
(x_{b_1}, x_{b_2}, x_{b_3}, ...) ∈ S_b,
(..., x_{r_1}, x_{r_2}, x_i) ∈ S_r]

obtained by the modified partitioned index reduction procedure, where μ
exists between vertices x_0 and x_i. Since G is B*-partitioned and the sequence
S_a, S_b, ..., S_r was considered only once in that order,

S_a ∩ S_b ∩ ... ∩ S_r = ∅,

and μ is B-constrained. We apply Theorem 7 to show that μ is a shortest path.

5. APPLICATION 

The method of Theorem 8 has been utilized in an application of urban trans-
portation planning as related to the Dallas-Ft. Worth Regional Transportation
Study. By use of a process called traffic assignment, sampled data of driver
origins and destinations may be applied to a simulated driver path to distribute
travel patterns over a network in a meaningful way. The selection of the simu-
lated driver path is by the determination of the minimum time path between a
given origin and all possible destinations, given that a finite travel time may be
affixed to each link of the simulation network. The expanded data samples may
then be "loaded" to the network by using the minimum time paths as a guide
to obtain a simulated cumulative volume count for each link over which flow
is indicated. Since the process of finding minimum paths is tedious (though
straightforward), extensive use has been made of electronic computing equipment
for the last several years to perform traffic assignments on the minimum path
principle.





I wrote a program for the IBM 7094 to accommodate the method of B*-
partitioning for a maximum of four subnetworks, each of which could contain
no more than 4000 nodes. To obtain the complete set of minimum B-constrained
paths over the resultant network a two-pass scanning procedure of the subnet-
works was used. Since essentially vertical lines of partition were imposed on
the traffic assignment network representing the area, which is about 70 miles
from east to west and 40 miles from north to south, the scan was initiated from
the leftmost subnetwork (west) and progressed through successive adjacent
subnetworks to the east. If the four subnetworks were lettered in this direction
as A, B, C, and D, at the completion of the right scan the B-constrained
paths shown by Table 1 were complete through the subnetworks indicated:


Table 1

A      B      C      D
AB     BC     CD
ABC    BCD
ABCD


For example, BCD indicates that a path originating in subnetwork B has passed
through subnetwork C and is currently represented within subnetwork D.
Final determination of all B-constrained paths was realized by the completion
of the left scan, which originates from subnetwork C and proceeds toward
subnetwork A. The loading of the network proceeded in a somewhat reverse
manner until the origin of flow was reached.

This process, as described, constitutes only a basic simplified phase of the 
overall assignment process for the Dallas-Ft. Worth Regional Transportation 
Study. Further discussion of procedures is beyond the scope of this article. 


6. ACKNOWLEDGMENT 

The application work as described by this article was conducted by the Texas
Transportation Institute under Research Project HPR-1(5) 2-8-63-60, sponsored
by the Texas Highway Department in cooperation with the U.S. Bureau of
Public Roads. I wish to thank those personnel of the Texas Highway Depart-
ment who assisted in the delineation of the basic properties of traffic assignment.
I am also indebted to Dr. Charles Pinnell of the Texas Transportation Institute
for his encouragement during development of the technique and to Professor
B. C. Moore of the Texas A&M University Mathematics Department for his
comments on the article.






7. REFERENCES 

[1] C. Berge, Theory of Graphs (English translation 1962 by Methuen & Co.), Wiley,
New York, 1962.

[2] C. W. Blumentritt, "Some Contributions to Graph Theory and Network Analysis,"
master's thesis, Texas A&M University, 1964.

[3] R. G. Busacker and T. L. Saaty, Finite Graphs and Networks, McGraw-Hill,
New York, 1965.

[4] L. R. Ford, Jr., "Network Flow Theory," The RAND Corporation, P-923, 1956.

[5] E. F. Moore, "Shortest Path Through a Maze," Proc. Intl. Symp. Switching
Circuits, Harvard University, 1957.


PLUS COURTES DISTANCES PAR LA METHODE DES
GRAPHES CLOISONNES AVEC UNE APPLICATION 
A LA PREPARATION D’UN SYSTEME 
DE TRANSPORTS URBAINS 

RESUME

A technique is described for finding the shortest route through large networks.
By partitioning the network, it is shown that each resulting subnetwork can be
reduced to suboptimal paths until convergence is reached, a point at which the
true minimum paths exist. A particular partitioning method is presented for
obtaining constrained paths for a special application.

The author has written a computer program describing a version of the
algorithm that serves to find minimum paths through street networks. This
program has served as one phase of a traffic assignment system in current use
for networks of up to 16,000 intersections.


TRAFFIC ASSIGNMENT BY A STOCHASTIC MODEL

L'Affectation du Trafic par un Modèle Stochastique

Hasso von Falkenhausen†

Institut für Praktische Mathematik
Technische Hochschule
Darmstadt, West Germany


1. ANALYSIS OF THE PROBLEM

The traffic demand for a road network N_0 is defined by a matrix V in which v(a, z) equals the number of traffic units moving from origin a to destination z in a specified period of time.

† Now with McKinsey & Company, Inc., Königsallee 98, Düsseldorf, West Germany.

Realization of the total demand V results in the
actual traffic volume to be observed in each branch of N_0. For a projected road network N_1 the expected traffic volume in each branch has to be determined by an operation with V on N_1. This operation is called "traffic assignment." Its accuracy is tested in the existing network: the assignment operation is performed with V on N_0. If differences between observed and calculated traffic volumes are insignificant, the operation is considered to be a true representation of the actual realization of demand. The quality of an assignment operation is measured by a parameter t:

t = [ (1/m) SUM_{i=1}^{m} (ve(i) - v(i))^2 ]^{1/2},

where m = number of branches in which the actual traffic volume has been observed,

ve(i) = observed traffic volume in branch i,
v(i) = calculated traffic volume in branch i.

The road network N is defined by nodes, branches, turn-off penalties, and 
capacity constraints for branches. For city networks branch lengths as well as 
turn-off penalties are usually given in time units. 
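As a minimal sketch (with hypothetical observed and calculated volume lists standing in for real count data), the quality parameter t defined above can be computed as:

```python
import math

def quality_t(observed, calculated):
    """Root-mean-square deviation t between observed volumes ve(i) and
    calculated volumes v(i) over the m counted branches."""
    m = len(observed)
    return math.sqrt(sum((ve - v) ** 2 for ve, v in zip(observed, calculated)) / m)
```

Identical observed and calculated volumes give t = 0; uniform deviations of 10 units give t = 10.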


2. GENERAL OUTLINE OF AN ASSIGNMENT ALGORITHM

An assignment algorithm usually consists of two steps:

1. Finding "acceptable" paths from a to z for all v(a, z) ≠ 0.

2. Assigning a percentage of the v(a, z) traffic units to each acceptable path found in Step 1.

Several methods have been developed for finding the n best loopless paths in 
a network (see References). It is rather difficult, however, to define criteria for 
discarding all those paths that are likely not to be selected by a driver. In 
the network shown in Figure 1 there are three paths leading from A to Z:

1. A-B-C-F-K-Z, length 60,

2. A-B-C-D-E-F-K-Z, length 68,

3. A-B-G-H-K-Z, length 70.




Path 2 is an unlikely choice for the driver, since between C and F the route through D and E means a 200 per cent increase in driving distance over the optimal path. Path 3 is an acceptable alternative, with only a 25 per cent increase between B and K.

For calculating the percentage of v(a, z) to be assigned to a path, a number of formulas have been developed [11]. Most of them are based on a modified version of the Kirchhoff law (distribution of an electrical current on parallel conductors). This approach has been found to be a somewhat unsatisfactory model in a number of studies.

3. THE STOCHASTIC MODEL 

When a driver makes his choice of a path from A to Z, he is influenced by a number of factors such as estimated path length, traffic density encountered on his trip, inconvenience of an otherwise short path (tramway traffic, many left turns), and driving habits. Each driver weighs these factors differently and makes his decision for the "optimal" path, optimal according to his standards. Obviously each driver will arrive at a different estimate of the path length.

In the stochastic model the length of a path is considered to be a random variable whose distribution is called the "estimating function." Mean and variance of the distribution are assumed to depend on the true path length. In my thesis [5] I discuss several approaches for finding suitable estimating functions. In a number of test calculations the logarithmic normal distribution yielded the best approximation of calculated traffic volumes to values observed in traffic counts.

Model A. The driver estimates the length l_i of path i between the turn-off node from and the return node to the optimal path.

Model B. The driver estimates each branch length d_ij of path i between the turn-off node from and the return node to the optimal path.

Once the parameters of the estimating function are known, the model can be made part of an assignment algorithm in two ways:

1. In a Monte Carlo simulation the estimated best path is calculated n times for all v(a, z) ≠ 0. Each time (1/n)v(a, z) traffic units are assigned to the estimated best path.

2. After determining h acceptable paths between a and z, the v(a, z) traffic units are assigned to the h paths according to the selection probability of each path.


4. DETERMINATION OF PARAMETERS 

For determining the values of the relevant parameters the following notation is used:

l_i = true length of path i between the turn-off node from and the return node to the optimal path,

x_i = estimate of l_i,

f(x_i) = estimating function, the probability distribution of x_i,

p_i = selection probability of path i; p_i = P(path i is estimated to be the shortest path),

q_i = percentage of the total traffic volume v(a, z) observed on path i,

LN(m_i, s_i) = logarithmic normal distribution of the variable x_i with parameters m_i and s_i.


LN(m_i, s_i) = (sqrt(2π) s_i x_i)^(-1) exp[ -(ln x_i - ln m_i)^2 / (2 s_i^2) ],

where m_i is the median and s_i a measure of dispersion.


It is reasonable to assume that the true length l_i of path i is the median of the estimating function of x_i. If the limits for x_i are set to be

l_i / k ≤ x_i ≤ k l_i

with a probability of .9973 (this corresponds to the three-sigma limits of the normal distribution), the value s_i depends on k:

P(l_i / k ≤ x_i ≤ k l_i) = .9973,

P(ln l_i - ln k ≤ ln x_i ≤ ln l_i + ln k) = .9973.

Since y_i = ln x_i is distributed N(ln l_i, s_i),

3 s_i = (ln l_i + ln k) - ln l_i,

s_i = (ln k) / 3.
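As a quick numerical check of this relation (a sketch, not code from the paper): with median l and s = ln(k)/3, standardizing the limits l/k and k·l gives exactly ±3 standard deviations in log space, so the coverage probability is Φ(3) - Φ(-3) ≈ .9973 for any k > 1.

```python
import math

def norm_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def three_sigma_coverage(k):
    """P(l/k <= x_i <= k*l) when ln(x_i) ~ N(ln l, s) with s = ln(k)/3.
    Standardizing turns both limits into +/-3, independently of l."""
    s = math.log(k) / 3.0
    return norm_cdf(math.log(k) / s) - norm_cdf(-math.log(k) / s)
```

The result does not depend on the particular k chosen, which is what makes the relation s_i = (ln k)/3 usable.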

The unknown value of k can be estimated from observations in the existing network. In Figure 2, n paths lead from A to Z. The true length of path i to be considered in this calculation is the distance B-E_i-C. Of the total volume v(a, z) moving from A to Z, the percentage q_i has been observed on path i. The selection probability p_i of path i can be determined by a multiple integration over the estimating function. If, for example, n = 3, then p_2 can be calculated as follows:


p_2 = P(path 2 is estimated to be the shortest path),

p_2 = P(x_2 < x_1 ∩ x_2 < x_3).

For independent estimates x_i,

p_2 = ∫_{-∞}^{∞} f(x_2) [ ∫_{x_2}^{∞} f(x_1) dx_1 ] [ ∫_{x_2}^{∞} f(x_3) dx_3 ] dx_2,

p_2 = ∫_{0}^{∞} LN(l_2, s) [ ∫_{x_2}^{∞} LN(l_1, s) dx_1 ] [ ∫_{x_2}^{∞} LN(l_3, s) dx_3 ] dx_2.

With y_i = ln x_i, the corresponding expression holds for the normal densities N(ln l_i, s) of the y_i.


If approximation formulas such as those given in [8] are used for the integrals over f(x_1) and f(x_3), the value of p_2 can be found by numerical integration. The method described in [12] proved to be efficient and accurate.

In the integrals the parameters l_i are known. The unknown value of s can be found by regarding q_i as an estimate of p_i. From a traffic poll in Würzburg, West Germany, 397 observations for q_i with a total of 14,340 traffic units were available. The maximum number of paths selected by drivers for going from A to Z was 5. For each observation of v(a, z), q_i was calculated for path i. A value k was found by setting the integral for p_i equal to q_i. Each value k was weighted by the number of traffic units of the underlying observation. The sample mean k and the sample standard deviation s_k are given below:

Model A: k = 3.42, s_k = 0.26,

Model B: k = 3.69, s_k = 0.47.


For Model B the formulas for p_i have to be slightly changed. Because of the central limit theorem, the path length l_i is approximately normally distributed. Mean and variance of l_i have to be calculated from the means and variances of all branch lengths d_ij that are part of path i.
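Under the Model A assumptions, the selection probabilities p_i can also be estimated by straightforward sampling rather than numerical integration. The sketch below is illustrative only: the path lengths and the use of k = 3.42 as the dispersion parameter are stand-ins, and this is not the integration routine of [12].

```python
import math
import random

def selection_probabilities(lengths, k=3.42, trials=50_000, seed=7):
    """Monte Carlo estimate of p_i = P(path i is judged shortest) when each
    driver draws an independent estimate x_i ~ LN(l_i, s): median l_i,
    dispersion s = ln(k)/3 (Model A)."""
    s = math.log(k) / 3.0
    rng = random.Random(seed)
    wins = [0] * len(lengths)
    for _ in range(trials):
        est = [rng.lognormvariate(math.log(l), s) for l in lengths]
        wins[est.index(min(est))] += 1  # this path was judged shortest
    return [w / trials for w in wins]
```

For the three paths of Figure 1 (lengths 60, 68, 70), the shortest path receives the largest share of the probability mass, while the two longer paths retain non-negligible shares.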


5. ASSIGNMENT BY MONTE CARLO SIMULATION

The stochastic model described has been applied to urban traffic studies in several major cities in West Germany. The estimated optimal path from a to z for each v(a, z) ≠ 0 was calculated n times. Each time, (1/n)v(a, z) traffic units were assigned to the path found. The value of n was determined by test calculations: n was considered to be sufficiently large when the mean load on a branch stayed within given tolerance limits from iteration n - 1 to n. It was found that 5 ≤ n ≤ 10 was sufficient for the accuracy required in the study. "Unacceptable" paths were discarded automatically by the optimal path algorithm because of their small selection probability. The results from using this model in assignment calculations for a number of West German cities have been highly satisfactory. In test calculations for the city of Würzburg the following values of t (the square root of the mean-square deviation between observed and calculated traffic loads; see the first formula in this article) were obtained:


          Value of t
  k     Model A   Model B

  0       380       380
  2.5     199       218
  3.0     198       191
  3.5     134       158
  4.0     191       207


Setting k = 0 is equivalent to using the optimum path assignment method.
For details of the calculations see [11]. 
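The simulation loop of this section can be sketched as follows. This is a simplified illustration, not the author's production program: the toy network, the demand dictionary, and the per-branch lognormal draws (a Model B flavour, with every branch length re-estimated on each iteration) are assumptions of the sketch.

```python
import heapq
import math
import random

def shortest_path(graph, a, z):
    """Dijkstra shortest path on a dict-of-dicts graph; assumes z is
    reachable from a."""
    dist, prev, done = {a: 0.0}, {}, set()
    heap = [(0.0, a)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == z:
            break
        for v, l in graph[u].items():
            nd = d + l
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [z]
    while path[-1] != a:
        path.append(prev[path[-1]])
    return path[::-1]

def monte_carlo_assignment(graph, demand, n=10, k=3.42, seed=1):
    """Each of the n iterations re-estimates every branch length with a
    lognormal draw (median = true length, s = ln(k)/3) and assigns
    (1/n) v(a, z) traffic units to the path that is then shortest."""
    s = math.log(k) / 3.0
    rng = random.Random(seed)
    load = {}
    for (a, z), v in demand.items():
        for _ in range(n):
            perturbed = {u: {w: rng.lognormvariate(math.log(l), s)
                             for w, l in nbrs.items()}
                         for u, nbrs in graph.items()}
            path = shortest_path(perturbed, a, z)
            for u, w in zip(path, path[1:]):
                load[(u, w)] = load.get((u, w), 0.0) + v / n
    return load
```

Unlikely paths simply stop winning the perturbed shortest-path calculation, which is the mechanism by which "unacceptable" paths are discarded automatically.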

6. ASSIGNMENT BY AN ANALYTICAL METHOD

By means of the stochastic model the selection probability p_i of path i can be calculated as outlined. In Section 3 the use of p_i in an assignment algorithm was mentioned. For each of the n acceptable paths between a and z the expected load from demand v(a, z) is found to be p_i v(a, z) traffic units. A number of algorithms are available for determining the n best paths in a network, like those described in [2], [6], and [10]. The main problem is, however, the procedure of discarding all those paths that appear to be "unlikely" choices. One criterion for an unacceptable path could be a selection probability p_i < p_min, where p_min is a given threshold. From test calculations it could be concluded that p_min ≈ 0.1. An assignment algorithm based on this analytical approach will lead to estimated savings in computer time of about 30 to 40 per cent over the assignment method discussed in Section 5.
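A minimal sketch of this analytical assignment step, assuming the selection probabilities have already been computed; renormalizing over the surviving paths is an assumption of the sketch, not a step stated in the text.

```python
def analytical_assignment(path_probs, v, p_min=0.1):
    """Discard paths whose selection probability falls below p_min (the
    threshold suggested by the test calculations above), renormalize the
    remaining probabilities, and load p_i * v on each surviving path."""
    kept = {i: p for i, p in enumerate(path_probs) if p >= p_min}
    total = sum(kept.values())
    return {i: v * p / total for i, p in kept.items()}
```

Paths below the threshold receive no traffic; the full demand v is split over the acceptable paths in proportion to their selection probabilities.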

7. REFERENCES

[1] C. Berge, La Théorie des Graphes, Paris, 1958.

[2] S. Clarke, A. Krikorian, and J. Rausen, "Computing the N Best Loopless Paths in a Network," J. Soc. Indust. Appl. Math., 11, 1096-1102 (1963).

[3] E. W. Dijkstra, "A Note on Two Problems in Connexion with Graphs," Numerische Mathematik, 1, 269-271 (1959).

[4] H. von Falkenhausen, "Distribution of Vehicular Traffic on a Road Network," p. 718 (Summary) in Proceedings of the IFIP Congress 1962, Amsterdam.

[5] H. von Falkenhausen, Ein stochastisches Modell zur Verkehrsumlegung. Dissertation, Darmstadt, 1966.

[6] H. von Falkenhausen, "Berechnung der n besten Wege in einem Netz," Ablauf- und Planungsforschung, 8 (1967).

[7] A. Hald, Statistical Theory with Engineering Applications, New York, 1952.

[8] C. Hastings, Approximations for Digital Computers, Princeton University Press, 1955.

[9] P. A. Mäcke, Das Prognoseverfahren in der Strassenverkehrsplanung, Wiesbaden-Berlin, 1964.

[10] M. Pollack, "Solution of the kth Best Route Through a Network: A Review," J. Math. Anal. Appl., 3, 547-559 (1961).

[11] G. Steierwald, G. Scholz, and D. Haupt, "Probleme der Verkehrsumlegung," Strasse und Autobahn, 14, 469-480 (1965).

[12] H. Waldschmidt, Numerische Integration nach dem Romberg-Verfahren. Programmbeschreibung, Rechenzentrum der TH Darmstadt (August 1965).


L'AFFECTATION DU TRAFIC PAR UN MODELE STOCHASTIQUE

SUMMARY

The decision process of a driver who leaves point A for his destination Z in a road network is described by a stochastic model. According to preliminary calculations, estimates of the travel time between two points of the network follow the log-normal distribution. The actual values of the distribution parameters are obtained from traffic surveys. The stochastic model is incorporated into an algorithm that calculates the expected traffic volume in each branch of the network. The algorithm is applied either in a Monte Carlo simulation or in an analytical calculation that uses the log-normal distribution to determine the selection probability of each path. Both methods are described, and computational results are discussed.


OPTIMALISATION DES PROJETS ROUTIERS
Optimum Design of Highways

J. L. Groboillot
Ingénieur, Francorelab

J. L. Deligny
Ingénieur des Ponts et Chaussées

Service Spécial des Autoroutes du Ministère de l'Équipement, Paris, France

1. THE PROBLEMS OF ALIGNMENT OPTIMIZATION

1.1 General Remarks

The optimization of an alignment takes on different aspects according to:



- the design stage of the project,

- the degrees of freedom allowed,

- the optimization criteria adopted.

The design stage of the project may range from the establishment of a master plan for the development of a national road network to detailed studies of earth moving on the construction site itself, passing through the preparation of preliminary designs.

The degrees of freedom may be very numerous for an intercity motorway and, on the contrary, very limited in an urban zone.

The optimization criteria may be based on purely technical considerations of vehicle traffic flow, on notions of the profitability of the works at the national level, or on notions of least cost of construction and operation.

The methods presented here obviously treat only a partial aspect of the general optimization problems. They are situated at the design stage of a project limited in time and in space to the framework of an operation already provided for in the national master plan